I'm having an issue that I'm struggling to understand. I have an HTML page with a ul list containing a bunch of li items, and within those li items are elements that I would like to scrape using Selenium with Python. The page's markup is as follows:

<ul id="results">
    <li>
        <div class="name">John</div>
        <div class="age">23</div>
    </li>
    <li>
        <div class="name">Bob</div>
        <div class="age">39</div>
    </li>
    ..... #more li
</ul>

This seems like a pretty simple problem: save the li elements in a variable, then iterate over them to extract the information. The problem is that no matter what I do, my results always give back the first list item over and over again. The loop runs the correct number of times, but always refers back to the first item. So if I do the following:

results = driver.find_elements_by_xpath("""//*[@id="results"]/li""")
for result in results:
    name = result.find_element_by_xpath("""//*[@class="name"]""").text
    print(name)

Now, if there are 10 li elements in this particular case, the name "John" just prints out 10 times instead of changing with each iteration.

  • Try using find_element first, then find_elements (note the plural): there is one results list but multiple names. Also double-check your XPath. Commented Jan 9, 2018 at 7:02

1 Answer

Your XPath for the second search is incorrect. It must begin with a dot (.); otherwise, the search starts from the top of the document rather than from the current element. That is why it always finds the first item. See my example below.

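# Note: newer Selenium 4 releases removed find_element(s)_by_xpath;
# there, use find_elements(By.XPATH, ...) and find_element(By.XPATH, ...) instead.
results = driver.find_elements_by_xpath('//*[@id="results"]/li')
for result in results:
    # The leading "." anchors the search to the current <li>, not the whole document.
    name = result.find_element_by_xpath('.//*[@class="name"]').text
    print(name)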
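
To see the effect of the search context without launching a browser, here is a minimal sketch using Python's standard-library xml.etree.ElementTree as a stand-in for Selenium. This is an illustration, not Selenium itself: the HTML string is a shortened copy of the markup from the question, and the document-wide `//` search is emulated by searching from the root element on every iteration, which is an assumption made for the demo.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the markup from the question (two items only).
HTML = """
<ul id="results">
    <li><div class="name">John</div><div class="age">23</div></li>
    <li><div class="name">Bob</div><div class="age">39</div></li>
</ul>
""".strip()

root = ET.fromstring(HTML)   # root is the <ul> element
items = root.findall("li")   # direct <li> children of the <ul>

# Emulates the buggy version: the search always starts at the top,
# so the loop variable has no influence on what is found.
buggy = [root.find(".//*[@class='name']").text for _ in items]

# Emulates the fixed version: the search starts at the current <li>.
fixed = [li.find(".//*[@class='name']").text for li in items]

print(buggy)  # ['John', 'John']
print(fixed)  # ['John', 'Bob']
```

The only difference between the two lists is the node the search starts from, which is exactly what the leading dot controls in the Selenium XPath.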

4 Comments

That's the one! Thank you so much for your assistance. The missing dot was all the difference.
So to clarify, the "." would start searching from the previous result? How does it work exactly?
@RyanN. Think of . as the current node being processed. Your loop rebinds the variable result to each new <li>, but if the expression doesn't begin with ., the // in the XPath will always start searching from the top of the DOM, regardless of which element you call it on.
gotcha. Once again thank you very much for the answer and explanation.