Selenium page navigation keeps looping (Python)

Question

I've just started using Selenium to scrape a table from a web page, and I implemented the page navigation with Selenium. However, the output keeps looping when I run the code. I'm pretty sure I wrote the code wrong. What should I fix so that the Selenium navigation works?

import requests
import csv
from bs4 import BeautifulSoup as bs
from selenium import webdriver

browser=webdriver.Chrome()
browser.get('https://dir.businessworld.com.my/15/posts/16-Computers-The-Internet')

# url = requests.get("https://dir.businessworld.com.my/15/posts/16-Computers-The-Internet/")
soup=bs(browser.page_source)

filename = "C:/Users/User/Desktop/test.csv"
csv_writer = csv.writer(open(filename, 'w'))

pages_remaining = True

while pages_remaining:
    for tr in soup.find_all("tr"):
        data = []
        # for headers ( entered only once - the first time - )
        for th in tr.find_all("th"):
            data.append(th.text)
        if data:
            print("Inserting headers : {}".format(','.join(data)))
            csv_writer.writerow(data)
            continue

        for td in tr.find_all("td"):
            if td.a:
                data.append(td.a.text.strip())
            else:
                data.append(td.text.strip())
        if data:
            print("Inserting data: {}".format(','.join(data)))
            csv_writer.writerow(data)

try:
    #Checks if there are more pages with links
    next_link = driver.find_element_by_xpath('//*[@id="content"]/div[3]/table/tbody/tr/td[2]/table/tbody/tr/td[6]/a ]')
    next_link.click()
    time.sleep(30)
except NoSuchElementException:
    rows_remaining = False

KunduK · Accepted Answer · 2020-01-06 10:17:33Z

Check whether a Next button is present on the page. If it is, click it; otherwise, exit the while loop.

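# find_elements returns an empty list when there is no "Next" link
# (the find_element(s)_by_* helpers were removed in Selenium 4; By comes from selenium.webdriver.common.by)
next_links = browser.find_elements(By.XPATH, "//a[contains(.,'Next')]")
if next_links:
    next_links[0].click()
else:
    break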

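There is no need for time.sleep(). Use WebDriverWait() instead: it returns as soon as the expected condition is met rather than always pausing for a fixed time.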
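To illustrate why an explicit wait is usually preferable to a fixed sleep, here is a rough, standard-library-only sketch of the polling idea. It is not Selenium's implementation; the function name `wait_until` and its parameters are hypothetical and chosen only for this example.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Poll condition() until it returns a truthy value or the timeout elapses.

    Simplified stand-in for the idea behind an explicit wait; the name and
    parameters are assumptions made for this illustration.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            # Stop waiting as soon as the condition holds.
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within {} s".format(timeout))
        time.sleep(poll_interval)
```

Unlike time.sleep(30), which always pauses for the full 30 seconds, a loop like this returns as soon as the condition becomes true and only raises an error if the timeout is reached.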

Code:

import csv
from bs4 import BeautifulSoup as bs
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

browser=webdriver.Chrome()
browser.get('https://dir.businessworld.com.my/15/posts/16-Computers-The-Internet')
WebDriverWait(browser, 10).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "table.postlisting")))
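# Name the parser explicitly to avoid BeautifulSoup's "no parser specified" warning
soup = bs(browser.page_source, "html.parser")

filename = "C:/Users/User/Desktop/test.csv"
# newline='' prevents blank lines between rows on Windows
csv_writer = csv.writer(open(filename, 'w', newline=''))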

pages_remaining = True

while pages_remaining:
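    WebDriverWait(browser,10).until(EC.visibility_of_element_located((By.CSS_SELECTOR,"table.postlisting")))
    # Re-parse the page that is currently loaded; parsing only once before the loop re-scrapes page 1
    soup = bs(browser.page_source, "html.parser")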
    for tr in soup.find_all("tr"):
        data = []
        # for headers ( entered only once - the first time - )
        for th in tr.find_all("th"):
            data.append(th.text)
        if data:
            print("Inserting headers : {}".format(','.join(data)))
            csv_writer.writerow(data)
            continue

        for td in tr.find_all("td"):
            if td.a:
                data.append(td.a.text.strip())
            else:
                data.append(td.text.strip())
        if data:
            print("Inserting data: {}".format(','.join(data)))
            csv_writer.writerow(data)


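    # find_elements returns an empty list when there is no "Next" link
    # (the find_element(s)_by_* helpers were removed in Selenium 4)
    next_links = browser.find_elements(By.XPATH, "//a[contains(.,'Next')]")
    if next_links:
        next_links[0].click()
    else:
        break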


5 Comments

swm Over a year ago

It didn't work. I copied the XPath for the 'Next' button and got //*[@id="content"]/div[3]/table/tbody/tr/td[2]/table/tbody/tr/td[4]/a , so I substituted it for "//a[contains(.,'Next')]", and that didn't work either. How should I change the XPath so that the Selenium navigation works?

KunduK Over a year ago

Please just copy the entire code. I have tested it, and it clicks the Next button. You only have 2 pages of data, right?

swm Over a year ago

Do you know why it keeps looping and scraping page 1 instead of scraping the other pages?

KunduK Over a year ago

You said before that it was working as expected. However, it clicks every Next button found on the web page. I haven't checked your data.

swm Over a year ago

I just checked the data. Thanks to you, the pagination works; it's just that Selenium keeps scraping the same page even after it has navigated to the next page.
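The last comment reports that the same page is scraped after every click. One way to rule that out is to re-read and re-parse the page source on every pass of the loop, and to read it only after the click has actually replaced the old page. The following is a minimal sketch of that structure, not the answerer's code: the helper names `scrape_all_pages` and `run_with_selenium`, the `max_pages` safety cap, and the staleness wait are assumptions made for illustration, while the `table.postlisting` selector and the 'Next' XPath are taken from the answer.

```python
import csv


def scrape_all_pages(get_page_source, parse_rows, click_next, writer, max_pages=100):
    """Write the rows of every page, re-reading the page source on each pass.

    Returns the number of pages processed.
    """
    pages_done = 0
    while pages_done < max_pages:  # safety cap (assumption) against endless loops
        # Key point: fetch and parse the CURRENT page inside the loop.
        for row in parse_rows(get_page_source()):
            writer.writerow(row)
        pages_done += 1
        # click_next() must return False when no "Next" link is left.
        if not click_next():
            break
    return pages_done


def run_with_selenium(url, filename):
    """Hypothetical wiring of the loop above to Selenium and BeautifulSoup."""
    # Local imports so scrape_all_pages() can be exercised without a browser.
    from bs4 import BeautifulSoup
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    browser = webdriver.Chrome()
    browser.get(url)
    table = (By.CSS_SELECTOR, "table.postlisting")

    def get_page_source():
        WebDriverWait(browser, 10).until(EC.visibility_of_element_located(table))
        return browser.page_source

    def parse_rows(html):
        soup = BeautifulSoup(html, "html.parser")
        for tr in soup.find_all("tr"):
            cells = [c.get_text(strip=True) for c in tr.find_all(["th", "td"])]
            if cells:
                yield cells

    def click_next():
        links = browser.find_elements(By.XPATH, "//a[contains(.,'Next')]")
        if not links:
            return False
        old_table = browser.find_element(*table)
        links[0].click()
        # Assumes the click reloads the page, which makes the old table stale.
        WebDriverWait(browser, 10).until(EC.staleness_of(old_table))
        return True

    try:
        # newline="" avoids blank lines between rows on Windows.
        with open(filename, "w", newline="") as f:
            return scrape_all_pages(get_page_source, parse_rows, click_next,
                                    csv.writer(f))
    finally:
        browser.quit()
```

Calling `run_with_selenium(url, "test.csv")` with the URL from the question would drive a real browser. Note that this sketch writes the header cells again for each page, and if the last page still shows a 'Next' link, only `max_pages` stops the loop.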
