How do I fire multiple javascript events on a webpage using python?

Question

I am webscraping Glassdoor.com for company reviews using Python.

Currently, I am using Beautiful Soup and grequests. This works fine for all the fields I need, except for the "Advice to Management" section, which only loads once the Continue Reading button is pressed. See the example below for this page of reviews:

[Screenshots: the Continue Reading button; the expanded review]

There are no changes to the URL as far as I can tell, but there is a JS click-event being fired in the console: Event: EiReviews: Click [continueReading-71858088]

I found an online tutorial for Selenium WebDriver and wrote this code:

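from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

# Selenium 4 takes the driver path through a Service object.
driver = webdriver.Chrome(service=Service("C:\\chromedriver.exe"))
driver.get("https://www.glassdoor.com/Reviews/Alteryx-Reviews-E351220.htm")

# Keep the element itself: calling .click() on it returns None,
# which would leave nothing to pass to execute_script.
btn = driver.find_element(By.CLASS_NAME, "v2__EIReviewDetailsV2__continueReading")
driver.execute_script("arguments[0].click();", btn)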

I need something that scales better, as this takes ~20 seconds to open Chrome and click a single button. I need to be able to click every "Continue Reading" button on the page, as my end goal is to scrape every review for ~1,000 companies.

Looking at the page's HTML, you can see that right before the <div id="Container"> element there is a script element starting with window.appCache={... which contains the complete reviews in a somewhat odd dictionary/JSON format. For example, it contains the text that appears when you click Continue Reading: "summary":"Great place to work, been here 4+ years","summaryOriginal":null,"advice":"Don't rush too finish a project". Maybe you can extract everything from there.
– sound wave, Commented Jan 2, 2023 at 10:54
Alternatively, you can load the site with Selenium, loop through all the reviews, and click the Continue Reading button automatically wherever it is present.
– sound wave, Commented Jan 2, 2023 at 10:55
Thanks! The window.appCache dict has all the information I need.
– Dunc, Commented Jan 2, 2023 at 21:37
Good! Is it OK if I post the comment with the solution as an answer, so that you can accept it and the question can be closed?
– sound wave, Commented Jan 3, 2023 at 7:53

sound wave · Accepted Answer · 2023-01-04 19:14:35Z

1

Looking at the page's HTML, you can see that right before the <div id="Container"> element there is a script element starting with window.appCache={... which contains the complete reviews in a dictionary format. For example, it contains the text that appears when you click Continue Reading:

"summary":"Great place to work, been here 4+ years",
"summaryOriginal":null,"advice":"Don't rush too finish a project"

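A minimal sketch of how that could be pulled out without a browser, assuming the value assigned to window.appCache is valid JSON (I have not verified this for every page). The helper names extract_app_cache and iter_reviews, the "reviews" key, and the shortened sample_html are hypothetical and only there for illustration; the real object is much larger and its nesting may differ, which is why the sketch walks the whole structure looking for an "advice" key instead of relying on a fixed path.

```python
import json

MARKER = "window.appCache"  # marker quoted in the answer above


def extract_app_cache(html: str) -> dict:
    """Return the object assigned to window.appCache (assumed to be valid JSON)."""
    start = html.find(MARKER)
    if start == -1:
        raise ValueError("window.appCache not found in page source")
    brace = html.index("{", start)  # first '{' after the marker
    # raw_decode parses one JSON value and ignores whatever follows it,
    # such as a trailing ';' and the rest of the page.
    obj, _ = json.JSONDecoder().raw_decode(html, brace)
    return obj


def iter_reviews(node):
    """Yield every nested dict that carries an "advice" key."""
    if isinstance(node, dict):
        if "advice" in node:
            yield node
        for value in node.values():
            yield from iter_reviews(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_reviews(item)


# Hypothetical, heavily shortened page source used only for illustration.
sample_html = (
    '<script>window.appCache={"reviews":[{"summary":"Great place to work, '
    'been here 4+ years","summaryOriginal":null,'
    '"advice":"Don\'t rush too finish a project"}]};</script>'
    '<div id="Container"></div>'
)

reviews = list(iter_reviews(extract_app_cache(sample_html)))
for review in reviews:
    print(review["summary"], "|", review["advice"])
```

Since this only needs the page source, the same function should work on the response text you already get from grequests, so no click events have to be fired at all.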
