How to scroll correctly in a dynamically-loading webpage with Selenium?

Question

Here is the link to the website: website

I would like to get the links to all the hotels in this location.

Here is my script:

import pandas as pd
import numpy as np
from selenium import webdriver
import time

PATH = r"driver\chromedriver.exe"  # raw string so the backslash is not read as an escape

options = webdriver.ChromeOptions() 
options.add_argument("--disable-gpu")
options.add_argument("--window-size=1200,900")
options.add_argument('enable-logging')


driver = webdriver.Chrome(options=options, executable_path=PATH)

driver.get('https://fr.hotels.com/search.do?destination-id=10398359&q-check-in=2021-06-24&q-check-out=2021-06-25&q-rooms=1&q-room-0-adults=2&q-room-0-children=0&sort-order=BEST_SELLER')

cookie = driver.find_element_by_xpath('//button[@class="uolsaJ"]')
try:
    cookie.click()
except:
    pass

for i in range(30):
    driver.execute_script("window.scrollBy(0, 1000)")
    time.sleep(5)

time.sleep(5)

my_elems = driver.find_elements_by_xpath('//a[@class="_61P-R0"]')

links = [my_elem.get_attribute("href") for my_elem in my_elems]


X = np.array(links)
print(X.shape)
#driver.close()

But I cannot find a way to tell the script: scroll down until there is nothing more to scroll.

I tried changing these parameters:

for i in range(30):
    driver.execute_script("window.scrollBy(0, 1000)")
    time.sleep(30)

I changed the time.sleep() value, the number 1000, and so on, but my output keeps changing, and not in the right way.

output

As you can see, I scraped a different number of links on each run. How can I make my script scrape the same amount each time? Not necessarily every link, but at least a stable number.

Here it scrolls, and at some point it seems to get stuck and scrapes only the links it has at that moment. That's not what I want.

Prophet · Accepted Answer · 2021-06-25 08:40:21Z

2

There are several issues here.

You collect the elements and their links only after you have finished scrolling, whereas you should do that inside the scrolling loop.
You should wait until the cookie alert appears before closing it.
You can scroll until the footer element is displayed.
Something like this:

import pandas as pd
import numpy as np
from selenium import webdriver
import time
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

PATH = r"driver\chromedriver.exe"  # raw string so the backslash is not read as an escape

options = webdriver.ChromeOptions() 
options.add_argument("--disable-gpu")
options.add_argument("--window-size=1200,900")
options.add_argument('enable-logging')


driver = webdriver.Chrome(options=options, executable_path=PATH)
wait = WebDriverWait(driver, 20)

driver.get('https://fr.hotels.com/search.do?destination-id=10398359&q-check-in=2021-06-24&q-check-out=2021-06-25&q-rooms=1&q-room-0-adults=2&q-room-0-children=0&sort-order=BEST_SELLER')

wait.until(EC.visibility_of_element_located((By.XPATH, '//button[@class="uolsaJ"]'))).click()

def is_element_visible(xpath):
    wait1 = WebDriverWait(driver, 2)
    try:
        wait1.until(EC.visibility_of_element_located((By.XPATH, xpath)))
        return True
    except Exception:
        return False

while not is_element_visible("//footer[@id='footer']"):
    my_elems = driver.find_elements_by_xpath('//a[@class="_61P-R0"]')

    links = [my_elem.get_attribute("href") for my_elem in my_elems]

    X = np.array(links)
    print(X.shape)

    driver.execute_script("window.scrollBy(0, 1000)")
    time.sleep(5)


#driver.close()
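
One possible refinement, offered as a sketch rather than a tested fix for this site: the loop above rebuilds `links` from scratch on every pass, so anything the page removes from the DOM while scrolling, or anything loaded by the final scroll, can be missed. The helper below accumulates links across passes and drops duplicates while keeping first-seen order. The names `collect_while_scrolling`, `read_links`, `scroll_once`, `reached_end`, and `max_scrolls` are hypothetical; the three callables are passed in so the logic does not depend on a particular Selenium version.

```python
from typing import Callable, Dict, Iterable, List, Optional


def collect_while_scrolling(
    read_links: Callable[[], Iterable[Optional[str]]],
    scroll_once: Callable[[], None],
    reached_end: Callable[[], bool],
    max_scrolls: int = 200,
) -> List[str]:
    """Gather links over several scroll steps, without duplicates.

    All three callables are assumptions supplied by the caller:
    read_links returns the hrefs currently in the DOM, scroll_once
    scrolls and waits, reached_end reports whether the bottom is visible.
    """
    seen: Dict[str, None] = {}  # a dict keeps insertion order

    def remember(hrefs: Iterable[Optional[str]]) -> None:
        for href in hrefs:
            if href:  # skip anchors without an href
                seen.setdefault(href, None)

    for _ in range(max_scrolls):  # hard cap so the loop cannot run forever
        remember(read_links())
        if reached_end():
            break
        scroll_once()

    # One last read covers whatever the final scroll loaded.
    remember(read_links())
    return list(seen)
```

With the objects defined above, `read_links` could be `lambda: [a.get_attribute("href") for a in driver.find_elements_by_xpath('//a[@class="_61P-R0"]')]`, `scroll_once` could be a small function that runs `driver.execute_script("window.scrollBy(0, 1000)")` followed by `time.sleep(5)`, and `reached_end` could be `lambda: is_element_visible("//footer[@id='footer']")`.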

edited Jun 25, 2021 at 8:40

answered Jun 25, 2021 at 8:16

Prophet


12 Comments

RandallCloud Over a year ago

Thanks Prophet, always here to help :) I will check your code asap

Prophet Over a year ago

I have updated the answer since I think there was a problem in it. Now it should be better. One day I will learn Python :)

RandallCloud Over a year ago

while not find_elements_by_xpath("//footer[@id='footer']"): NameError: name 'find_elements_by_xpath' is not defined

Prophet Over a year ago

@RandallCloud I fixed that more than 3 hours ago... See the updated answer

RandallCloud Over a year ago

It seems that it doesn't do anything. The page doesn't scroll and the script ends with nothing.

Dmitriy Zub · Accepted Answer · 2021-07-07 06:18:15Z

1

You can try this by querying the DOM directly and locating an element that appears only at the bottom of the page, using Selenium's .is_displayed() method, which returns True or False:

# https://stackoverflow.com/a/57076690/15164646
while True:
    # is_displayed() returns False until the element becomes visible
    # "#message" is the "No more results" element at the bottom of the YouTube search page
    end_result = driver.find_element_by_css_selector('#message').is_displayed()
    driver.execute_script("var scrollingElement = (document.scrollingElement || document.body);scrollingElement.scrollTop = scrollingElement.scrollHeight;")

    # further code below

    # once the element is displayed, break out of the while loop
    if end_result:
        break

I wrote a blog post where I used this method to scrape YouTube Search.
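
A different stopping rule, in case the page has no reliable "end of results" element: keep scrolling to the bottom until the document height stops growing for a few checks in a row. This is a swapped-in technique, not the method shown above, and it assumes the page gets taller each time new results load. The names `scroll_until_stable`, `get_height`, `scroll_to_bottom`, `stable_rounds`, and `max_rounds` are hypothetical; the callables are passed in so the sketch runs without a browser.

```python
import time
from typing import Callable


def scroll_until_stable(
    get_height: Callable[[], int],
    scroll_to_bottom: Callable[[], None],
    pause: float = 2.0,
    stable_rounds: int = 3,
    max_rounds: int = 100,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Scroll until the page height is unchanged for stable_rounds checks.

    Returns the number of scrolls performed. max_rounds is a safety cap
    for pages that never stop growing.
    """
    last_height = get_height()
    unchanged = 0
    rounds = 0
    while unchanged < stable_rounds and rounds < max_rounds:
        scroll_to_bottom()
        sleep(pause)  # give lazy-loaded content time to arrive
        rounds += 1
        height = get_height()
        if height == last_height:
            unchanged += 1  # nothing new loaded on this pass
        else:
            unchanged = 0  # page grew, so start counting again
            last_height = height
    return rounds
```

With Selenium, `get_height` could be `lambda: driver.execute_script("return document.documentElement.scrollHeight")` and `scroll_to_bottom` could be `lambda: driver.execute_script("window.scrollTo(0, document.documentElement.scrollHeight)")`. A larger `stable_rounds` or `pause` trades speed for tolerance of slow loads.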

edited Jul 7, 2021 at 6:18

answered Jun 25, 2021 at 10:21

Dmitriy Zub


3 Comments

RandallCloud Over a year ago

The script seems to run endlessly?

Dmitriy Zub Over a year ago

Indeed! Thank you for letting me know! As soon as it is fixed, I'll add another comment here so you know.

Dmitriy Zub Over a year ago

Hey @RandallCloud! I updated the answer. Now it breaks out of the while loop when the element at the bottom of the page is located.
