My code runs against a website with a table where clicking each row opens a JavaScript pop-up window.

I want my code to iterate over the rows and click each one, which opens the second window; perform some action there; and then close that window and move on to the next row.

However, when my code closes the first window, it clicks the first row again and never moves on to row #2.

from selenium import webdriver

from bs4 import BeautifulSoup
import pandas as pd
import time
import requests
driver = webdriver.Chrome()
vals=[]
finalz=[]
productlink=[]
driver.get('https://aaaai.planion.com/Web.User/SearchSessions?ACCOUNT=AAAAI&CONF=AM2021&USERPID=PUBLIC&ssoOverride=OFF')
time.sleep(3)
page_source = driver.page_source
soup = BeautifulSoup(page_source,'html.parser')
productlist=soup.find_all('tr',class_='clickdiv')

for item in productlist:
    ea = item.find_all('td')
    title=ea[0].text
    sam=driver.find_element_by_class_name('clickdiv') #opens the window
    sam.click()
    time.sleep(1)
    cl=driver.find_element_by_class_name('XX') #this is the close window button
    cl.click()

2 Answers


As written, sam = driver.find_element_by_class_name('clickdiv') will always find the first row. Because you call find_element_by_class_name() (singular) instead of find_elements_by_class_name() (plural), the driver searches the whole page and returns only the first element with the class "clickdiv", which is always the first row of the table.

Rather than using BeautifulSoup to identify the rows to iterate through, find all of them with the Selenium driver itself, then iterate through those row elements and click each one.

productlist = driver.find_elements_by_class_name('clickdiv')

for item in productlist:
    title = item.find_element_by_css_selector("td").text  # text of the row's first cell
    item.click()
    time.sleep(1)
    driver.find_element_by_class_name('XX').click()  # close the pop-up window

3 Comments

Thanks, works perfectly. Would you know how to get the next item in the table using the css_selector approach? For example, we have the title with this method, but what if I want the next column, 'Type'? That is also a td; how would we distinguish it?
You can use item.find_elements_by_css_selector("td") to get a list of everything within the row that is in td tags. So if the table always has the same columns, you could just index into the list to get desired info (e.g. row = item.find_elements_by_css_selector("td"); title = row[0].text; type = row[1].text)
Thank you so much!!

Just extract the links in advance from the soup, then driver.get() each one. The links are inside the onclick attribute; a simple regex extracts the final URL from that JavaScript instruction.

import re # additional import

protocol = 'https:'  # additional global variable

# your code.....

# extract links and visit each
page_source = driver.page_source
soup = BeautifulSoup(page_source,'lxml')
links = [protocol + re.search(r',"(.*?)"', i['onclick']).group(1) for i in soup.select('.clickdiv')]

for link in links:
    driver.get(link)
    # do something with each page
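To illustrate the regex on its own (with a made-up onclick value, since the exact attribute contents aren't shown in the question): it captures the first double-quoted string that follows a comma, which is the relative URL passed to the page's JavaScript handler.

```python
import re

protocol = 'https:'

# Hypothetical onclick value of the shape the answer assumes:
onclick = 'openWin(\'SSES\',"//aaaai.planion.com/Web.User/SessionDetails?SESSIONID=123")'

# ',"(.*?)"' matches a comma, an opening quote, then lazily captures
# everything up to the next quote: the protocol-relative URL
match = re.search(r',"(.*?)"', onclick)
link = protocol + match.group(1)
print(link)  # https://aaaai.planion.com/Web.User/SessionDetails?SESSIONID=123
```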

