Cannot get_attribute('href') from element via Selenium

Question

I've been stuck on this for ages.

I'm trying to build a scraper that collects the listings on this website, and I cannot get the URL of each listing. Can you please help?

I've tried numerous ways to locate the element. The latest attempt uses the absolute XPath (locating by class always failed as well).

The code:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import pandas as pd
import time

PATH = "/Users/csongordoma/Documents/chromedriver"
driver = webdriver.Chrome(PATH)
driver.get('https://ingatlan.com/lista/elado+lakas+budapest')

data = {}
df = pd.DataFrame(columns=['Price', 'Address', 'Size', 'Rooms', 'URL'])

listings = driver.find_elements_by_css_selector('div.listing__card')
for listing in listings:
    data['Price'] = listing.find_elements_by_css_selector('div.price')[0].text
    data['Address'] = listing.find_elements_by_css_selector('div.listing__address')[0].text
#    data['Size'] = listing.find_elements_by_css_selector('div.listing__parameter listing__data--area-size')[0].text
    data['URL'] = listing.find_elements_by_xpath('/html[1]/body[1]/div[1]/div[2]/div[4]/div[1]/main[1]/div[1]/div[1]/div[1]/a[3]')[0].text
    df = df.append(data, ignore_index=True)

print(len(listings))
print(data)

#   driver.find_element_by_xpath("//a[. = 'Következő oldal']").click()

driver.quit()

The error message:

Traceback (most recent call last):
  File "hello.py", line 18, in <module>
    data['URL'] = listing.find_elements_by_xpath('/html[1]/body[1]/div[1]/div[2]/div[4]/div[1]/main[1]/div[1]/div[1]/div[1]/a[3]')[0].text
IndexError: list index out of range

Many thanks!

The link seems to be the second anchor, a[2], of the listing rather than a[3]. Also use a relative XPath instead of an absolute one, and use get_attribute('href') instead of text.
– Arundeep Chohan, Commented Nov 26, 2020 at 23:52
Your find_elements call is returning no matching elements. Fix the XPath.
– DMart, Commented Nov 27, 2020 at 3:18
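
The IndexError in the traceback follows from how find_elements behaves: it returns a plain list, which is empty when nothing matches, so indexing it with [0] fails. A minimal sketch of a guard, where first_text is a hypothetical helper name and not part of Selenium:

```python
def first_text(elements, default=None):
    """Return the .text of the first element, or `default` if the list is empty.

    `elements` is meant to be the list returned by a find_elements call.
    This helper is hypothetical and not part of Selenium.
    """
    return elements[0].text if elements else default
```

With the question's code this might be used as data['Price'] = first_text(listing.find_elements_by_css_selector('div.price')), which leaves the field as None instead of raising when a card lacks that element.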

Arundeep Chohan · Accepted Answer · 2020-11-27 00:10:59Z

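Something like the line below should work. It finds the second anchor, a[2], inside the listing element and reads its href. The leading dot keeps the XPath relative to the listing; without it, //a[2] is evaluated against the whole document.

data['URL'] = listing.find_element_by_xpath('.//a[2]').get_attribute('href')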
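
For readers on Selenium 4, where the find_element_by_* helpers give way to find_element(By.XPATH, ...), the same idea could be sketched as below. The names collect_listing_urls, card_selector and link_xpath are hypothetical. The card selector comes from the question and the a[2] position from this answer; neither has been checked against the live page.

```python
try:
    from selenium.webdriver.common.by import By
except ImportError:
    # Fallback so the sketch can be imported without Selenium installed;
    # the strings match the values of By.CSS_SELECTOR and By.XPATH.
    class By:
        CSS_SELECTOR = "css selector"
        XPATH = "xpath"


def collect_listing_urls(driver, card_selector="div.listing__card",
                         link_xpath=".//a[2]"):
    """Return one href per listing card (hypothetical helper).

    Assumes every card contains the anchor matched by `link_xpath`;
    find_element raises NoSuchElementException otherwise.
    """
    urls = []
    for card in driver.find_elements(By.CSS_SELECTOR, card_selector):
        # The leading "." scopes the search to this card. A bare "//a[2]"
        # would be evaluated against the whole document for every card.
        link = card.find_element(By.XPATH, link_xpath)
        urls.append(link.get_attribute("href"))
    return urls
```

Calling collect_listing_urls(driver) after driver.get(...) would return a list of URLs in card order, which can then be assigned to the URL column.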
