I have the following HTML body, which contains a list of elements. Please keep in mind that this HTML is just for demonstration; in the actual page, the list contains more than 20 properties.

<dl>
   <dt class="sc-ellipsis">Merk</dt>
   <dd>
       <a href="https://www.autoscout24.nl/auto/audi/">Audi</a>
   </dd>
   <dt class="sc-ellipsis">Model</dt>
   <dd>
       <a href="/lst/audi/q3">Q3</a>
   </dd> ....more properties like that
</dl>

I would like to get the words Audi and Q3.

I can simply do this in Selenium:

browser.find_elements_by_css_selector('dd')[0].text # to get Audi
browser.find_elements_by_css_selector('dd')[1].text # to get Q3

BUT sometimes some of the elements might be missing, so I cannot rely on the positions used above. For example, if Audi is missing, then this:

browser.find_elements_by_css_selector('dd')[0].text # now it returns Q3

returns Q3. One common pattern is that Audi always follows Merk and Q3 always follows Model. That is, if Merk is not in the HTML body, Audi won't be either. What I tried was to find the element immediately after Merk:

WebDriverWait(browser, 10).until(EC.visibility_of_all_elements_located((By.XPATH, './/[(@class="sc-ellipsis") and (text()="Merk")]/following-sibling::dd')))[0].text

But this returns an empty list, which means it didn't find Audi. Does anyone know how to get the element that follows Merk (or Model, or whatever comes next in the list)? I can add the error handling myself, so that if Merk is not part of the list, the code doesn't try to get the next element.

  • I think your XPath is looking for the following sibling with a p tag, when in reality you want to look for the following sibling with a dd tag. Commented Apr 25, 2020 at 13:21
  • Can you provide the URL of the page? Commented Apr 25, 2020 at 13:29

1 Answer

The following code returns the text of the dd that follows the dt with the text "Merk":

from selenium import webdriver
from selenium.webdriver.common.by import By

browser = webdriver.Chrome()
browser.get('https://www.autoscout24.nl/aanbod/audi-q3-sportback-pro-line-business-35-tfsi-110-kw-150-p-benzine-zilver-757ef256-c967-457b-8db1-4cb8b287c311?cldtidx=19')
# find_element returns the first match, i.e. the nearest dd after the "Merk" dt
elem = browser.find_element(By.XPATH, '//dt[text()="Merk"]/following-sibling::dd')
print(elem.text)
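
If the label might be absent, a small helper along these lines could cover that case. This is only a sketch: get_property is a hypothetical name, and it assumes each value sits in the dd directly after its dt, as in the sample HTML. It uses find_elements (plural), which returns an empty list when nothing matches instead of raising, and normalize-space(.) rather than text() so whitespace around the label doesn't matter. The literal "xpath" is the value of By.XPATH, written out so the snippet needs no imports.

```python
XPATH = "xpath"  # same string as selenium.webdriver.common.by.By.XPATH


def get_property(browser, label):
    """Return the text of the first dd after the dt named `label`, or None.

    Assumption: `label` contains no double quotes, since it is pasted
    straight into the XPath expression.
    """
    xpath = f'//dt[normalize-space(.)="{label}"]/following-sibling::dd[1]'
    matches = browser.find_elements(XPATH, xpath)  # empty list if the label is absent
    return matches[0].text if matches else None


# Possible usage with a real driver:
#   merk = get_property(browser, "Merk")    # None if the page has no "Merk" entry
#   model = get_property(browser, "Model")
```

Returning None keeps the "is it there?" check in one place, so the calling code can decide what to do with a missing property.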

After examining your code, it seems the only issue is that the first step of your XPath (.//[...]) does not name a tag. Every XPath step needs one, so use either the wildcard * or dt:

'.//*[(@class="sc-ellipsis") and (text()="Merk")]/following-sibling::dd'
'.//dt[(@class="sc-ellipsis") and (text()="Merk")]/following-sibling::dd'
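
Since the real list has more than 20 properties, it may be easier to read them all in one pass and then look labels up in a dict. A rough sketch, assuming every dt is followed by its own dd (read_properties is a made-up name, not part of Selenium; "xpath" is the value of By.XPATH):

```python
XPATH = "xpath"  # same string as selenium.webdriver.common.by.By.XPATH


def read_properties(browser):
    """Map each dt label to the text of the nearest following dd sibling."""
    properties = {}
    for dt in browser.find_elements(XPATH, "//dl/dt"):
        # Relative XPath: search from this dt, not from the document root.
        dds = dt.find_elements(XPATH, "./following-sibling::dd[1]")
        if dds:
            properties[dt.text.strip()] = dds[0].text.strip()
    return properties


# Possible usage with a real driver:
#   props = read_properties(browser)
#   merk = props.get("Merk")   # None when the page has no "Merk" entry
```

One caveat: if a dt ever appears without a dd of its own, following-sibling::dd[1] would pick up the dd that belongs to the next pair, so this only fits pages where the pairs are complete.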

1 Comment

The class attribute is not the problem. You did not reference a tag type for the first tag or specify a wildcard, so XPath did not know what type of tag to look for before finding its sibling.
