Using a Selenium CSS selector to extract data

Question

Hello, I wrote this code, which returns a list of li elements, but I want to access each a tag inside them and open it. If you have any recommendations, I would be very grateful.

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
import pandas as pd
import time

options = Options()

# Creating our DataFrame to store the results
all_services = pd.DataFrame(columns=['Motif', 'Description'])

path = "C:/Users/Al4D1N/Documents/ChromeDriver_webscraping/chromedriver.exe"
driver = webdriver.Chrome(options=options, executable_path=path)

driver.get("https://www.mairie.net/national/acte-naissance.htm#plus")

list_of_services = driver.find_elements_by_css_selector(".list-images li")

I know that I need to iterate over each item in list_of_services, but I don't know how to open each a tag, since none of them has a class or id that would help me tell them apart.

What do you mean by "open each a tag"? Do you need the actual hrefs or only the titles?
– Mitchell Olislagers, Commented Jan 18, 2021 at 0:54
Actually, if you visit the website, you will see a second section further down the page that has multiple hrefs, and I need to open every <a> tag there. On every page opened, I have to extract specific data.
– Aladin, Commented Jan 18, 2021 at 1:04

Mitchell Olislagers · Accepted Answer · 2021-01-18 01:16:21Z

1

This is one way to extract the href of every link in the list.

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
import pandas as pd
import time

options = Options()

# DataFrame that will hold the scraped results
all_services = pd.DataFrame(columns=['Motif', 'Description'])

path = "C:/Users/Al4D1N/Documents/ChromeDriver_webscraping/chromedriver.exe"
# Note: recent Selenium 4 releases expect service=Service(path) instead of executable_path
driver = webdriver.Chrome(options=options, executable_path=path)

driver.get("https://www.mairie.net/national/acte-naissance.htm#plus")

# Get all elements with the class 'list-images'
list_of_services = driver.find_elements(By.CLASS_NAME, "list-images")

for service in list_of_services:
    # In each element, select the <a> tags
    atags = service.find_elements(By.CSS_SELECTOR, 'a')
    for atag in atags:
        # Read the href of each <a> tag and print it
        href = atag.get_attribute('href')
        print(href)

Output:

https://www.mairie.net/national/acte-mariage.htm#acte-naissance
https://www.mairie.net/national/acte-deces.htm#acte-naissance
https://www.mairie.net/national/carte-identite.htm#acte-naissance
https://www.mairie.net/national/passeport.htm#acte-naissance
https://www.mairie.net/national/casier-judiciaire.htm#acte-naissance
https://www.mairie.net/national/demande-carte-electorale.htm#acte-naissance
https://www.mairie.net/national/cadastre-plu.htm#acte-naissance
https://www.mairie.net/national/carte-grise-en-ligne-par-internet.htm#acte-naissance
https://www.mairie.net/national/certificat-non-gage.htm#acte-naissance
https://www.mairie.net/national/permis-conduire-delivrance.htm#acte-naissance
https://www.mairie.net/national/changement-adresse.htm#acte-naissance
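The nested loop above can probably be collapsed into a single descendant selector. The following is a minimal sketch, not part of the original answer: the helper name collect_hrefs is hypothetical, and it assumes the links of interest are all <a> elements nested under an element with the class list-images. It only relies on the driver exposing find_elements(by, value), and it skips empty and repeated hrefs.

```python
# The string value of selenium's By.CSS_SELECTOR, so the helper
# does not need to import selenium itself.
CSS_SELECTOR = "css selector"


def collect_hrefs(driver, selector=".list-images a"):
    """Return the unique, non-empty href values of all matching elements, in page order.

    `collect_hrefs` is a hypothetical helper; `driver` is any object with a
    Selenium-style find_elements(by, value) method.
    """
    hrefs = []
    for link in driver.find_elements(CSS_SELECTOR, selector):
        href = link.get_attribute("href")
        # Skip anchors without an href and links already seen
        if href and href not in hrefs:
            hrefs.append(href)
    return hrefs
```

With a real browser session this would be called as collect_hrefs(driver) right after driver.get(...). Returning plain strings rather than WebElement objects keeps the result usable after the browser navigates to another page.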

answered Jan 18, 2021 at 1:16

4 Comments

Aladin Over a year ago

And what if I want to open each href and extract specific data from each page? For example, I go to the first link shown there and call some find_elements method. Is there an alternative for that, or do I need to open each one again with driver.get(href)?

Mitchell Olislagers Over a year ago

Yes, I think the best way would be to store all hrefs in a list, then loop through that list and call driver.get(href). Alternatively, open each href in a new tab; see stackoverflow.com/questions/42814991/… for how to handle different tabs.

Mitchell Olislagers Over a year ago

Sorry I don't understand the question. Can you rephrase?

Aladin Over a year ago

Ah, I didn't see your Stack Overflow link. It's okay, I have the solution now. Thank you.
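The approach suggested in the comments (store the hrefs in a list, then call driver.get(href) on each) might look roughly like the sketch below. The names visit_each and main are hypothetical, and the extractor that reads the page title is only a placeholder for whatever data is actually needed from each page. It also assumes a chromedriver that Selenium can locate on its own. Collecting the hrefs as strings first matters because element references from the first page generally become stale once the browser navigates away.

```python
def visit_each(driver, hrefs, extract):
    """Load every URL with driver.get() and collect what extract(driver) returns.

    `visit_each` is a hypothetical helper; `extract` is any callable that
    reads data from the currently loaded page.
    """
    results = {}
    for href in hrefs:
        driver.get(href)
        results[href] = extract(driver)
    return results


def main():
    # Imported here so the helper above can be tried without a browser
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()  # assumption: chromedriver is discoverable
    try:
        driver.get("https://www.mairie.net/national/acte-naissance.htm#plus")
        # Copy the hrefs out as strings before leaving the page
        links = driver.find_elements(By.CSS_SELECTOR, ".list-images a")
        hrefs = [link.get_attribute("href") for link in links]
        # Placeholder extractor: the title of each visited page
        titles = visit_each(driver, hrefs, lambda d: d.title)
        for href, title in titles.items():
            print(href, title)
    finally:
        driver.quit()
```

Calling main() runs the whole flow in one browser window. Opening each link in its own tab, as the linked question discusses, is the alternative when the original page has to stay loaded.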
