Can't run JavaScript using the requests-html library in Python

Question

I need to pull some information out of several links whose pages rely on JavaScript. I know how to do it with Selenium, but it takes a lot of time and I need a more efficient way to pull this off.

I came across the requests-html library, and it looks like quite a robust fit for my purposes, but unfortunately I don't seem to be able to run the JavaScript with it.

I read the documentation from the following link https://requests-html.readthedocs.io/en/latest/

And tried the following code:

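from requests_html import HTMLSession
from bs4 import BeautifulSoup

session = HTMLSession()
resp = session.get("https://drive.google.com/file/d/1rZ-DhTFPCen6DvJXlNl3Bxuwj4-ULwoa/view")

# Run the page's JavaScript in headless Chromium; resp.html is updated in place
resp.html.render()

# Parse the rendered HTML with BeautifulSoup
soup = BeautifulSoup(resp.html.html, 'lxml')

# Look for the <img> elements that are visible when the page is opened in a browser
email = soup.find_all('img', {'class': 'ndfHFb-c4YZDc-MZArnb-BA389-YLEF4c'})
print(email)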

I get no results after running this code, even though the class exists if I open the link from my browser.

I've also tried sending headers with my requests, which didn't help. I tried the same code (with a different HTML tag, of course) on another link (https://web.archive.org/web/*/stackoverflow.com), but the HTML I get back includes a message saying that my browser must support JavaScript. My code for this part:

from requests_html import HTMLSession
from bs4 import BeautifulSoup

session = HTMLSession()
resp = session.get("https://web.archive.org/web/*/stackoverflow.com")

# Run the page's JavaScript in headless Chromium; resp.html is updated in place
resp.html.render()

# Parse and print the rendered HTML
soup = BeautifulSoup(resp.html.html, 'lxml')
print(soup)

The response I get:

<div class="no-script-message">
        The Wayback Machine requires your browser to support JavaScript, please email <a href="mailto:[email protected]">[email protected]</a><br/>if you have any questions about this.
      </div>

Any help would be appreciated. Thanks!

Learning from masters · Accepted Answer · 2021-12-22 10:24:06Z

1

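Add the sleep parameter to the render call. It makes requests-html wait that many seconds after the initial render, which gives the page's JavaScript time to finish before the HTML is captured: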

resp.html.render(sleep=2)
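
If a fixed two-second delay is not always enough, one option is to re-render with increasing delays until the element you want appears. The following is only a sketch under some assumptions: the helper name render_until_found, the delay values, and the timeout are illustrative and not part of requests-html. The only library calls it relies on are render(sleep=..., timeout=...) and find(selector).

```python
def render_until_found(html, selector, sleeps=(1, 2, 4), timeout=20):
    """Render the page with growing sleep values until `selector` matches.

    `html` is expected to behave like `resp.html` from requests-html:
    it needs a render(sleep=..., timeout=...) method and a find(selector)
    method. Returns the list of matches, or [] if nothing was found.
    """
    for sleep in sleeps:
        # Each call reloads the page in headless Chromium and waits
        # `sleep` seconds after the initial render.
        html.render(sleep=sleep, timeout=timeout)
        matches = html.find(selector)
        if matches:
            return matches
    return []


def demo():
    """Hypothetical usage with the URL and class name from the question."""
    from requests_html import HTMLSession  # imported lazily; needs Chromium

    session = HTMLSession()
    resp = session.get(
        "https://drive.google.com/file/d/1rZ-DhTFPCen6DvJXlNl3Bxuwj4-ULwoa/view"
    )
    images = render_until_found(
        resp.html, "img.ndfHFb-c4YZDc-MZArnb-BA389-YLEF4c"
    )
    print(images)
```

Calling demo() needs network access and the Chromium download that requests-html performs on first render. Note that this uses resp.html.find directly, so BeautifulSoup is not needed for a simple selector lookup.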

1 Comment

radomaj Over a year ago

For me, this was the answer

AlixaProDev · Accepted Answer · 2021-08-13 10:32:06Z

0

This should work on that site. But as you mentioned, the code worked for Stack Overflow but not for the other URL? That may be because the server did not respond, or because the tag you are looking for was not available at that time. Either way, requests-html should have given you an error.

I was about to check your problem and add it to my blog post, How to use Requests-HTML, but unfortunately the link you provided is not working.
