2

I'm scraping follower names from Twitter using Selenium. The followers page scrolls infinitely: whenever I scroll down, more followers load. I want to get to the bottom of the page so that I can scrape all the followers.

import time

number = 0
while number != 5:
   driver.execute_script("window.scrollTo(0,document.body.scrollHeight)")
   number = number + 1
   time.sleep(5)

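# find_elements_by_class_name was removed in Selenium 4; a compound class
# list like this one is matched with a CSS selector instead
from selenium.webdriver.common.by import By

usernames = driver.find_elements(
       By.CSS_SELECTOR,
       ".css-4rbku5.css-18t94o4.css-1dbjc4n.r-1loqt21.r-1wbh5a2.r-dnmrzs.r-1ny4l3l")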
for username in usernames:
   print(username.get_attribute("href"))

Right now the code scrolls 5 times. I hard-coded that value because I don't know how many scrolls are needed to reach the bottom of the page.

3 Answers

6

Use the code below for infinite loading. It keeps scrolling as long as new elements are being loaded, i.e. as long as the page height keeps changing.

import time

# Get the scroll height after the first page load
last_height = driver.execute_script("return document.body.scrollHeight")
while True:
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    # Wait for the page to load (an explicit wait is more reliable than a fixed sleep, if possible)
    time.sleep(2)
    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height
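
The same loop can be wrapped in a function so it is reusable. This is only a minimal sketch: the function name `scroll_to_bottom` and the `max_rounds` safety cap are my own additions (not part of the loop above), and `driver` is assumed to be any object with Selenium's `execute_script` method.

```python
import time


def scroll_to_bottom(driver, pause=2.0, max_rounds=100):
    """Scroll until document.body.scrollHeight stops growing.

    Returns the number of scrolls performed. `max_rounds` is a hypothetical
    safety cap for pages that never stop loading.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    rounds = 0
    while rounds < max_rounds:
        # Scroll down to the current bottom of the page
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        rounds += 1
        # Give the page time to load the next batch
        time.sleep(pause)
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new was loaded, so this is the bottom
        last_height = new_height
    return rounds
```

Call it as `scroll_to_bottom(driver)` before collecting the elements. The cap only matters on feeds that never end; without it such a page would keep the loop running forever.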

0

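The following script does not call time.sleep. Instead, it keeps re-issuing the scroll command in a busy loop for SCROLL_PAUSE_TIME seconds before checking the height, so newly loaded content is scrolled past as soon as it appears: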

import datetime

SCROLL_PAUSE_TIME = 4  # seconds to keep scrolling before each height check
last_height = driver.execute_script("return document.body.scrollHeight")
while True:
    time_past = datetime.datetime.now()
    # Keep scrolling to the bottom until SCROLL_PAUSE_TIME seconds have elapsed
    while (datetime.datetime.now() - time_past).total_seconds() <= SCROLL_PAUSE_TIME:
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

    # Stop once the page height no longer changes
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height
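
If the goal is to avoid waiting longer than necessary, one alternative is to poll the height and move on as soon as it grows, giving up only after a timeout. This is a rough sketch and not the script above: `wait_for_growth` and `scroll_until_stable` are names I made up, and `driver` is assumed to expose Selenium's `execute_script`.

```python
import time


def wait_for_growth(driver, last_height, timeout=4.0, poll=0.1):
    """Poll the page height until it exceeds last_height or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        height = driver.execute_script("return document.body.scrollHeight")
        if height > last_height or time.monotonic() >= deadline:
            return height
        time.sleep(poll)  # short pause so the browser is not flooded with calls


def scroll_until_stable(driver, timeout=4.0, poll=0.1):
    """Scroll to the bottom repeatedly; return the final page height."""
    last_height = driver.execute_script("return document.body.scrollHeight")
    while True:
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        new_height = wait_for_growth(driver, last_height, timeout, poll)
        if new_height == last_height:
            return last_height  # no growth within the timeout: assume the bottom
        last_height = new_height
```

With this shape the full timeout is only spent once, on the last scroll, when nothing more loads.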

0

I had the same scrolling problem, but the listing did not load when I scrolled to the very end of the page. The cause was a large footer, so I slightly modified the code above to scroll only as far as the footer.

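import time

last_height = driver.execute_script("return document.body.scrollHeight")
while True:
    # Scroll down, stopping 1300 px short of the bottom so the footer is not scrolled past
    # (1300 fits my page; adjust it to the footer height on yours)
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight - 1300);")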
    # Wait to load page
    time.sleep(2)
    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height
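
The hard-coded 1300 could also be replaced by measuring the footer at run time. This is just a sketch under assumptions: the helper name `scroll_above_footer` is made up, and it assumes the footer can be found with the CSS selector `footer`, which may not hold on your page. `driver` is assumed to expose Selenium's `execute_script(script, *args)`.

```python
def scroll_above_footer(driver, footer_selector="footer"):
    """Scroll to the bottom minus the footer height; return the height used.

    `footer_selector` is an assumed CSS selector for the page's footer.
    If no element matches, the offset is 0 and this scrolls to the very bottom.
    """
    footer_height = driver.execute_script(
        "var f = document.querySelector(arguments[0]);"
        "return f ? f.offsetHeight : 0;",
        footer_selector,
    )
    driver.execute_script(
        "window.scrollTo(0, document.body.scrollHeight - arguments[0]);",
        footer_height,
    )
    return footer_height
```

Inside the loop above, the `scrollTo` line would then become `scroll_above_footer(driver)`.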

Maybe this will be useful to someone.
