Here is what the table looks like on the web page (it's just one column):

Here is the HTML of the table I am trying to scrape:

If it matters, that table is nested within another table.

Here is my code:

    def filter_changed_records():
        # Scrape webpage for addresses from table of changed properties
        row_number = 0
        results_frame = locate_element(
            '//*[@id="oGridFrame"]'
        )
        driver.switch_to.frame(results_frame)
        while True:
            try:
                address = locate_element("id('row" + str(row_number) +
                                         "FC')/x:td")
                print(address)
                changed_addresses.append(address)
                row_number += 1
            except:
                print("No more addresses to add.")
                break

As you can see, there is a <tr> tag with an id of row0FC. This table is dynamically generated, and each new <tr> gets an id with an increasing number: row0FC, row1FC, row2FC, etc. That is how I planned to iterate through all the entries and add them to a list.

My locate_element function is the following:

    def locate_element(path):
        element = WebDriverWait(driver, 50).until(
            EC.presence_of_element_located((By.XPATH, path)))
        return element

It always times out after 50 seconds without finding the element. I'm unsure how to proceed. Is there a better way of locating the element?

SOLUTION BY ANDERSSON

    address = locate_element("//tr[@id='row%sFC']/td" % row_number).text

2 Answers

Your XPath seems to be incorrect.

Try below:

    address = locate_element("//tr[@id='row%sFC']/td" % row_number)

Also note that address is a WebElement. If you want to get its text content, you should use

    address = locate_element("//tr[@id='row%sFC']/td" % row_number).text
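
For reference, here is how the corrected locator could slot back into the question's loop. This is only a sketch: it reuses the driver, locate_element and changed_addresses names from the question, and catches the TimeoutException that WebDriverWait raises instead of using a bare except:

    from selenium.common.exceptions import TimeoutException

    def filter_changed_records():
        # Scrape webpage for addresses from table of changed properties
        results_frame = locate_element('//*[@id="oGridFrame"]')
        driver.switch_to.frame(results_frame)
        row_number = 0
        while True:
            try:
                address = locate_element(
                    "//tr[@id='row%sFC']/td" % row_number).text
                changed_addresses.append(address)
                row_number += 1
            except TimeoutException:
                # no row with this id: we've read the whole table
                print("No more addresses to add.")
                break

Note that the final iteration still waits the full 50 seconds before the TimeoutException fires, so a shorter timeout inside the loop may be worth considering.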

12 Comments

No luck, sadly. It still can't find it. Does the XPath also need to route through the parent table, or would that not affect it?
Can you check whether your table is located inside an iframe? Also, add the HTML for it as text, not as an image.
It is in a frame. I left that part out; I've edited the original code in the post to include those lines.
Did you miss return element in your locate_element() definition?
Somehow that bit got cut off. Yeah, I have that. locate_element() works fine for tons of other stuff, so that bit isn't the issue.

Parsing HTML with Selenium is slow. I would use BeautifulSoup for that.

Assuming you have loaded the page in driver, it would be something like:

    from bs4 import BeautifulSoup
    # ...

    # Parse the page that Selenium has already loaded
    soup = BeautifulSoup(driver.page_source, "html.parser")
    for td in soup.find_all('td'):
        try:
            # this assumes the address sits in each cell's title attribute
            addr = td['title']
            print(addr)
        except KeyError:
            # skip cells that have no title attribute
            pass
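
One caveat worth flagging, based on the comments under the accepted answer: the question's table lives inside an iframe, and driver.page_source returns the document of whichever frame the driver has currently switched into. So, assuming the locate_element helper from the question, you would switch into the frame before handing the source to soup:

    # Switch into the grid's iframe first so that page_source returns the
    # frame's document rather than the top-level page.
    results_frame = locate_element('//*[@id="oGridFrame"]')
    driver.switch_to.frame(results_frame)
    soup = BeautifulSoup(driver.page_source, "html.parser")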

5 Comments

Is the difference in speed big enough to justify migrating my entire script to it? It's about 500 lines of Selenium, so I don't want to spend the time switching to BeautifulSoup if it isn't a huge difference.
That depends on the number of pages you're getting info from and how many elements you're using Selenium to grab. If it's a one-off and time is not important, stick to Selenium. In future projects, parse the code with something else if speed is important...
I just did a speed test. Setup was as follows: I used Selenium to pull data from the white pages, one page with 100 hits, where each hit is a result block holding a name, an address and a phone number. I did 10 loops each for Selenium and BeautifulSoup (html.parser), extracting name, address and phone number from every hit (3 find commands per hit), which for each of them sums to 3010 find commands in total (10 loops * 100 persons * 3, plus 10 * 1 find-the-result-block commands). With soup the total time is 13 seconds; with Selenium it is 165 seconds, which makes soup about 12 times faster.
Wow. That's awesome. I think I'll migrate
And if you feed soup only the part of the HTML that holds your data (which I did in the example), your speed improves compared to what I did in my answer above (where I fed soup driver.page_source). This is done like so: container = driver.find_element_by_css_selector('div.relevant.section').get_attribute("outerHTML"), and then you feed only the container object to BeautifulSoup instead of the whole HTML page.
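
A short sketch of that last tip, where 'div.relevant.section' is the commenter's placeholder selector rather than the question's actual markup:

    # Grab only the container that holds the data and parse just that HTML.
    container = driver.find_element_by_css_selector(
        'div.relevant.section').get_attribute("outerHTML")
    soup = BeautifulSoup(container, "html.parser")
    # Collect the title attributes, as in the answer above.
    addresses = [td['title'] for td in soup.find_all('td')
                 if td.has_attr('title')]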
