My code runs against a website with a table where clicking each row opens a JavaScript pop-up window.

I want my code to iterate over the rows and click each one, which opens the second window; perform some action there; and then close that window and move on to the next row.

However, when my code closes the first window, it clicks the first row again and never moves on to row #2.

from selenium import webdriver

from bs4 import BeautifulSoup
import pandas as pd
import time
import requests
driver = webdriver.Chrome()
vals=[]
finalz=[]
productlink=[]
driver.get('https://aaaai.planion.com/Web.User/SearchSessions?ACCOUNT=AAAAI&CONF=AM2021&USERPID=PUBLIC&ssoOverride=OFF')
time.sleep(3)
page_source = driver.page_source
soup = BeautifulSoup(page_source,'html.parser')
productlist=soup.find_all('tr',class_='clickdiv')

for item in productlist:
    ea = item.find_all('td')
    title=ea[0].text
    sam=driver.find_element_by_class_name('clickdiv') #opens the window
    sam.click()
    time.sleep(1)
    cl=driver.find_element_by_class_name('XX') #this is the close window button
    cl.click()

2 Answers


As written, sam = driver.find_element_by_class_name('clickdiv') will always find the first row. Because you call find_element_by_class_name() (singular) instead of find_elements_by_class_name() (plural), the driver searches the whole page and returns only the first element with the class "clickdiv", which is always the first row of the table.

Rather than using BeautifulSoup to identify the rows to iterate through, find all of them with the Selenium driver itself, then iterate through those row elements and click each one.

productlist = driver.find_elements_by_class_name('clickdiv')

for item in productlist:
    title = item.find_element_by_css_selector("td").text  # text of the row's first cell
    item.click()
    time.sleep(1)
    driver.find_element_by_class_name('XX').click()  # close the pop-up window

3 Comments

Thanks, works perfectly. Would you know how to get the next item in the table using the css_selector approach? For example, we have the title with this method, but what if I want the next column, 'Type'? That is also a td; how would we distinguish it?
You can use item.find_elements_by_css_selector("td") to get a list of everything within the row that is in td tags. So if the table always has the same columns, you could just index into the list to get desired info (e.g. row = item.find_elements_by_css_selector("td"); title = row[0].text; type = row[1].text)
Thank you so much!!

Just extract the links in advance from the soup, then driver.get() each one. The links are inside the onclick attribute; a simple regex extracts the final URL from that JavaScript instruction.

import re # additional import

protocol = 'https:'  # additional global variable

# your code.....

# extract links and visit each
page_source = driver.page_source
soup = BeautifulSoup(page_source,'lxml')
links = [protocol + re.search(r',"(.*?)"', i['onclick']).group(1) for i in soup.select('.clickdiv')]

for link in links:
    driver.get(link)
    # do something with each page
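To illustrate the regex on its own (with a made-up onclick value, since the exact attribute contents aren't shown in the question): it captures the first double-quoted string that follows a comma, which is the relative URL passed to the page's JavaScript handler.

```python
import re

protocol = 'https:'

# Hypothetical onclick value of the shape the answer assumes:
onclick = 'openWin(\'SSES\',"//aaaai.planion.com/Web.User/SessionDetails?SESSIONID=123")'

# ',"(.*?)"' matches a comma, an opening quote, then lazily captures
# everything up to the next quote: the protocol-relative URL
match = re.search(r',"(.*?)"', onclick)
link = protocol + match.group(1)
print(link)  # https://aaaai.planion.com/Web.User/SessionDetails?SESSIONID=123
```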

