How to fetch dynamic web content using Python?

Question

I want to fetch the dynamic content of web pages. I have tried many Python modules, such as mechanize, urllib, and BS4, and have also used the simple_html_dom module in PHP, but none of them correctly fetches the content of a dynamic page.

I have tried this code:

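import urllib.request  # Python 3 replacement for urllib2

url = '<url>'
# Download the HTML exactly as the server sends it; no JavaScript is executed.
with urllib.request.urlopen(url) as response:
    html = response.read()
# Append the raw bytes to the output file; "with" closes it when done.
with open("E://<url>.html", "ab") as out:
    out.write(html)
print("successful fetching")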

I then opened the saved file in a browser while disconnected from the internet, and it did not contain the content that appears when the page is loaded online. I need to crawl such dynamic pages, which is not possible unless the complete HTML (as generated when the URL is opened in a browser) is stored in a variable. These modules fetch only the static content.

Could you please post an example of the code that you tried and what exactly it is you are trying to achieve?
– kylieCatt, Commented May 20, 2015 at 13:48
I can (unsurprisingly) get that webpage with a 3 line python "requests" script
– Vorsprung, Commented May 20, 2015 at 15:10

Anthon · Accepted Answer · 2017-04-04 20:06:42Z

1

On modern websites that use JavaScript, this simplistic approach doesn't work. You will either have to load all the JavaScript and execute it against the loaded HTML yourself or, the simpler solution, use a library such as Selenium that launches a real browser.
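A minimal sketch of the real-browser route, assuming Selenium is installed along with Firefox and its driver. The helper name `fetch_rendered_html` and the `make_driver` parameter are made up for this illustration; only `get`, `page_source`, and `quit` are Selenium's own.

```python
def fetch_rendered_html(url, make_driver=None):
    """Open url in a real browser and return the DOM as HTML after scripts have run."""
    if make_driver is None:
        # Imported lazily so this module still loads where Selenium is missing.
        from selenium import webdriver
        # Assumption: Firefox and geckodriver are available on this machine.
        make_driver = webdriver.Firefox
    driver = make_driver()
    try:
        driver.get(url)            # returns once the browser reports the page as loaded
        return driver.page_source  # serialized DOM, including script-inserted nodes
    finally:
        driver.quit()              # always close the browser, even on errors
```

Calling `fetch_rendered_html('<url>')` returns a string that can be written to disk like the output of the question's code. Content that scripts add after the initial load may still be missing at that point.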

That way the browser loads the page and executes all of the dynamic code. The only remaining problem is determining when the page has finished loading (JavaScript does not signal that it is done). I normally look for some element I know to be dynamically loaded and retry, with increasing intervals, until it is there or I time out.
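The retry-with-increasing-intervals idea can be sketched as a small generic helper. The name `wait_until` and its default values are assumptions chosen for illustration, not part of Selenium.

```python
import time

def wait_until(probe, timeout=30.0, first_delay=0.5, factor=2.0):
    """Call probe() with growing pauses until it returns a truthy value.

    Raises TimeoutError if nothing truthy comes back within timeout seconds.
    """
    deadline = time.monotonic() + timeout
    delay = first_delay
    while True:
        result = probe()
        if result:
            return result
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(min(delay, remaining))  # never sleep past the deadline
        delay *= factor                    # wait longer before the next retry
```

With a live Selenium driver, the probe could be `lambda: driver.find_elements(By.ID, "results")`, where `By` comes from `selenium.webdriver.common.by` and `"results"` is a hypothetical element id; `find_elements` returns an empty list while the element is absent. Selenium also ships `WebDriverWait`, which polls at a fixed interval rather than an increasing one.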

Once you decide enough dynamic content is there, you can start parsing the HTML with Selenium's built-in DOM search routines.
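As a rough sketch of that last step, the helper below collects the visible text of every matching element through `find_elements`. The function name `extract_text` is invented for this example; the locator is a (strategy, value) pair, the same shape Selenium's own wait helpers accept.

```python
def extract_text(driver, locator):
    """Return the visible text of every element matching locator.

    locator is a (strategy, value) pair, e.g. (By.CSS_SELECTOR, "div.result").
    """
    by, value = locator
    # find_elements returns a list, which is empty when nothing matches.
    return [element.text for element in driver.find_elements(by, value)]
```

With a live driver this might be called as `extract_text(driver, (By.CSS_SELECTOR, "div.result"))` after `from selenium.webdriver.common.by import By`; the selector `div.result` is a placeholder for whatever the target page actually uses.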

edited Apr 4, 2017 at 20:06

user2454725

answered May 23, 2015 at 9:24

Anthon