
Do you see something wrong with this setup?

(selenium, etc. imported earlier on)

It iterates through table_rows until it finds the first row where the “try” is successful, runs the getinfo() function from inside the “try” (which clicks a link, goes to a page, gets info, and then clicks the back button to return to the original page), and then keeps iterating through the rest of table_rows.

The correct number of table_rows iterations is performed by the end, and the “try” block is being entered again (the print() before current_cell works), but find_element_by_class_name doesn’t seem to be picking up any more “a0” cells in the subsequent iterations, even though there are definitely some there that should be found (the print() after current_cell never prints after the very first time).

Thank you for your help. I'm new to coding and have learned a ton, but this is stumping me.

def getinfo(current_cell):
    link_in_current_cell = current_cell.find_element_by_tag_name("a")
    link_in_current_cell.click()

    waitfortable2 = WebDriverWait(driver, 5).until(
        EC.presence_of_element_located((By.CLASS_NAME, "top-edit-table"))
        ) 
    
    print("Here is the info about a0.")

    driver.back()

    return True


for row in table_rows:
    print("got the row!")   
        
    waitfortable = WebDriverWait(driver, 5).until(
        EC.presence_of_element_located((By.XPATH, "/html/body/table[3]/tbody/tr/td[2]/form/table/tbody/tr/td/table/tbody/tr[4]/td/table/tbody/tr[1]/td[1]"))
        ) 

    try:
        print("we're trying!")
        current_cell = row.find_element_by_class_name("a0")
        print("we got an a0!")
        getinfo(current_cell)
    except:
        print("Not an a0 cell.")
        pass

    continue

Here is more of the code from before "for row in table_rows:" if you need it, but I don't think that part is an issue, as it is still iterating through the rest of table_rows after it finds the first "a0" cell.

try:
    WebDriverWait(driver, 5).until(
        EC.presence_of_element_located((By.XPATH, "/html/body/table[3]/tbody/tr/td[2]/form/table/tbody/tr/td/table/tbody/tr[4]/td/table/tbody/tr[1]/td[1]"))
        ) 

    table = driver.find_element_by_xpath("/html/body/table[3]/tbody/tr/td[2]/form/table/tbody/tr/td/table/tbody/tr[4]/td/table")
    table_rows = table.find_elements_by_tag_name("tr") 

    for row in table_rows:
        print("got the row!") 
        ....
        ....
        ....(see code box above) 
  • It is hard to say without seeing the rest of your code or the site you are scraping. Commented Feb 8, 2021 at 4:26
  • @goalie1998 I added my function getinfo() code. Does that help? (I didn't know what to return from it, so I returned True.) Commented Feb 8, 2021 at 4:51
  • What is the site? Commented Feb 8, 2021 at 5:34
  • It requires login credentials to get to. Nothing inherently wrong with the code I've written, then? Commented Feb 8, 2021 at 6:22
  • The only thing I can imagine is that things change when you navigate away from then back to the page. It's hard to do any real debugging with a bare except clause. I would not run getinfo(current_cell) for now, and either change or remove the try/except to see where your errors are coming from. Commented Feb 8, 2021 at 6:30

1 Answer

Success! I found my own workaround!

FYI: I still could not see anything wrong with my existing code. It correctly found all the a0 cells when I commented out the function call (#getinfo(current_cell); thank you @goalie1998 for the suggestion), and I only made two small changes to that function for this new workaround, which works correctly. So the problem must come from how Selenium behaves in a loop that (1) calls find_element_by_... on each row to find something that exists multiple times on the page (which is why you're creating the loop) and (2), inside that same loop, clicks a link, goes to another page, comes back, and is then supposed to keep iterating with find_element_by_... to get the next match on the page. Not sure why Selenium gets messed up by that, but it does. (More experienced coders, feel free to elaborate.)
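One likely explanation, offered as a hedged guess rather than a confirmed diagnosis: the elements in table_rows were located before the click, and once the browser navigates away and back those references are stale, so the next row.find_element_by_class_name("a0") raises StaleElementReferenceException, which a bare except then reports as "Not an a0 cell." The sketch below simulates that behaviour with stand-in classes only (FakePage, FakeRow, and StaleReference are hypothetical names, not Selenium APIs):

```python
class StaleReference(Exception):
    """Stand-in for Selenium's StaleElementReferenceException."""


class FakePage:
    """A page whose elements are rebuilt on every navigation."""

    def __init__(self, cell_texts):
        self.cell_texts = cell_texts
        self.generation = 0  # bumped each time the page is loaded again

    def find_rows(self):
        return [FakeRow(self, self.generation, text) for text in self.cell_texts]

    def click_link_and_go_back(self):
        self.generation += 1  # rows found earlier now belong to an old page load


class FakeRow:
    def __init__(self, page, generation, text):
        self.page, self.generation, self._text = page, generation, text

    def find_cell_text(self):
        if self.generation != self.page.generation:
            raise StaleReference("row belongs to an earlier page load")
        return self._text


def scan(page):
    log = []
    rows = page.find_rows()  # references captured once, like table_rows
    for row in rows:
        try:
            log.append("we got " + row.find_cell_text())
            page.click_link_and_go_back()
        except Exception:  # broad catch, like the bare except in the question
            log.append("Not an a0 cell.")
    return log


print(scan(FakePage(["first", "second", "third"])))
# ['we got first', 'Not an a0 cell.', 'Not an a0 cell.']
```

Only the first row succeeds, and the broad except hides the real reason the later rows fail, which matches the symptom in the question.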

Anyway, here is my workaround thought process, which may help some of you solve this issue for yourselves by doing something similar:

(1) Find all of the links BEFORE clicking on any of them (and create a list of those links)

Instead of trying to find & click the links one-at-a-time as they show up, I decided to find all of them FIRST (BEFORE clicking on them). I changed the above code to this:

from selenium.common.exceptions import NoSuchElementException

# this is where I'm storing all the links
text_link_list = []

for row in table_rows:

    print("got the row!")

    waitfortable = WebDriverWait(driver, 5).until(
        EC.presence_of_element_located((By.XPATH, "/html/body/table[3]/tbody/tr/td[2]/form/table/tbody/tr/td/table/tbody/tr[4]/td/table/tbody/tr[1]/td[1]"))
        )

    ## Get a0
    try:
        print("we're trying!")
        current_row_has_a0 = row.find_element_by_class_name("a0")
        print("we got an a0!")

        # this next part is just because there are also blank "a0" cells without
        # text (aka a link) in them, and I don't care about those ones.
        if current_row_has_a0.text != "":
            text_link_list.append(current_row_has_a0.text)
            print("text added to text_link_list!")
        else:
            print("wasn't a text cell!")
    except NoSuchElementException:
        # this row has no "a0" cell; any other error is no longer hidden
        pass
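A slightly different way to write step (1), as a sketch only: the plural find_elements returns an empty list when nothing matches, so no try/except is needed at all. The helper name collect_link_texts and its locator parameter are my own inventions for illustration; like the loop above, it looks only at the first matching cell in each row and skips blank ones.

```python
def collect_link_texts(rows, locator):
    """Return the text of the first matching cell in each row, skipping blanks.

    rows    -- row elements that offer find_elements(by, value)
    locator -- a (by, value) pair, unpacked into find_elements
    """
    texts = []
    for row in rows:
        # find_elements (plural) gives [] instead of raising when nothing matches
        cells = row.find_elements(*locator)
        if not cells:
            continue  # no matching cell in this row
        text = cells[0].text
        if text != "":
            texts.append(text)
    return texts
```

With Selenium this would be called as collect_link_texts(table_rows, (By.CLASS_NAME, "a0")), assuming By is imported from selenium.webdriver.common.by as in the rest of the code.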

(2) Iterate through that list of links, running your Selenium code that includes .click() and .back()

Now that I had my list of links, I could just iterate through it and run the .click() -> perform actions -> .back() function that I created (getinfo(); original code in the question above).

## brand new for loop, after "for row in table_rows" loop

for text in text_link_list:
    # waiting for page to load upon iterations
    table = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.XPATH, "/html/body/table[3]/tbody/tr/td[2]/form/table/tbody/tr/td/table/tbody/tr[4]/td/table/tbody/tr[1]/td[1]"))
        ) 
    
    # this is my .click() --> perform actions --> .back() function
    getinfo(text)

However, I needed to make two small changes to my getinfo() function.

One, I was now clicking on the links via their "link text", not the a0 class I was using before (so I need to use .find_element_by_link_text).

Two, I could now use the more basic driver.find_element_by_... instead of my original table.find_element_by_... Using "table" may have worked as well, but I was worried that the reference to "table" would be lost once my function had run the .click() code, so I went with "driver" since it was more certain. (I'm still pretty new to coding; this may not have been necessary.)

def getinfo(text):
    link_in_current_cell = driver.find_element_by_link_text(text)
    link_in_current_cell.click()

    waitfortable2 = WebDriverWait(driver, 5).until(
        EC.presence_of_element_located((By.CLASS_NAME, "top-edit-table"))
        ) 
    
    print("Here is the info from this temporary page.")

    driver.back()

    return True
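One caveat with looking links up by their text: find_element_by_link_text returns the first match, so two links with identical text would resolve to the same element. A possible alternative, sketched under my own assumptions, is to remember each row's position and find the rows again after every navigation. The name visit_rows_by_index and both of its callback parameters are hypothetical:

```python
def visit_rows_by_index(find_rows, visit):
    """Call visit() on each row, re-finding the rows before every visit.

    find_rows -- zero-argument callable returning the current list of rows
    visit     -- callable taking one row; returns True if it handled the row
    """
    handled = 0
    total = len(find_rows())
    for index in range(total):
        rows = find_rows()  # fresh references for the current page load
        if index >= len(rows):
            break  # the table got shorter; stop rather than guess
        if visit(rows[index]):
            handled += 1
    return handled
```

Here find_rows could wrap the table lookup and find_elements_by_tag_name("tr") call from the question, and visit could look for the "a0" cell and call the original getinfo(current_cell), returning True when it clicked something.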

I hope that this can all be helpful to someone. I was stoked when I did it and it worked! Let me know if it helped you! <3

PS. NOTE IF HAVING STALE ERRORS / StaleElementReferenceException: If you are iterating through a loop and clicking a link found via something like driver.find_element_by_... (instead of using .back()), you may run into a stale element error, especially if you call .click() on a variable that you assigned earlier or outside of the loop. You can fix this (maybe not in the most beautiful way) by looking the element up again right at the point in the loop where you want to click the link, with code like this:

my_link = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.LINK_TEXT, "linktext"))
    )
my_link.click()
continue
  • "continue" is only needed if you want to end the current iteration and begin the next one, and it is not needed at all if this is already the end of your loop body
  • you can change the 10 to whatever number of seconds you'd like to give the code to find the element (most likely, the time for the page to reload) before the script fails
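If the element can still go stale between the lookup and the click, one hedged option is to repeat the lookup a few times. This is only a sketch: click_fresh and its parameters are names I made up, and the exception to retry on is passed in by the caller.

```python
import time


def click_fresh(find, retry_on, attempts=3, pause=0.5):
    """Look the element up again right before each click attempt.

    find     -- zero-argument callable returning a fresh element
    retry_on -- exception class (or tuple of classes) that triggers a retry
    Returns the number of attempts used; re-raises after the last failure.
    """
    for attempt in range(attempts):
        try:
            find().click()
            return attempt + 1
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts, let the caller see the error
            time.sleep(pause)  # give the page a moment before looking again
```

With Selenium, find could be a lambda wrapping the WebDriverWait lookup above, and retry_on could be StaleElementReferenceException from selenium.common.exceptions.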