3

I am working on a screen-scraping tool in Python. Looking through the source of the webpage, I noticed that most of the data is loaded through JavaScript.

Any idea how to scrape a JavaScript-based webpage? Is there a tool for this in Python?

Thanks

3
  • 3
    Why not just consume the Javascript directly? Commented Nov 18, 2011 at 14:07
  • 2
    Duplicate of stackoverflow.com/questions/2148493/… Commented Nov 18, 2011 at 14:16
  • How do you consume the JavaScript directly? For instance, how do you call the JS function JS_Function(var1, var2, var3) from Python? Commented Nov 18, 2011 at 21:34
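One possible reading of "consume the JavaScript directly" is to skip rendering and pull the data out of the script source itself. The sketch below assumes the page embeds its data as a JSON-compatible object literal assigned with `var name = {...};` inside a script tag. The function name `extract_js_object` and the variable name `pageData` are hypothetical, and a literal that is not valid JSON (unquoted keys, single quotes, function calls) would not parse this way.

```python
import json
import re


def extract_js_object(html, var_name):
    """Return the object literal assigned to var_name in html, or None.

    Assumes the literal is valid JSON and is terminated by "};".
    """
    pattern = r"var\s+" + re.escape(var_name) + r"\s*=\s*(\{.*?\})\s*;"
    match = re.search(pattern, html, re.DOTALL)
    if match is None:
        return None
    # json.loads raises ValueError if the literal is not strict JSON.
    return json.loads(match.group(1))
```

For example, `extract_js_object(page_source, "pageData")` would return a dict if the page contains `var pageData = {"items": [1, 2]};`. This does not execute any JavaScript, so it cannot call a function such as `JS_Function`; that needs a real JavaScript engine, for instance a browser driven from Python.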

3 Answers

5

Scraping JavaScript-based webpages is possible with Selenium. In particular, try Selenium WebDriver.
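A minimal sketch of that approach, assuming a recent Selenium release with Chrome and a matching driver available: the browser runs headless, so no window opens, and `page_source` returns the DOM after the page's scripts have run. The helper names `fetch_rendered_html` and `extract_cells`, and the choice of `<td>` cells as the data to extract, are illustrative assumptions and not part of Selenium.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the text content of every <td> element."""

    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell:
            self.cells[-1] += data


def extract_cells(html):
    """Return the stripped text of each table cell in html."""
    parser = CellCollector()
    parser.feed(html)
    return [cell.strip() for cell in parser.cells]


def fetch_rendered_html(url):
    """Load url in headless Chrome and return the HTML after scripts ran."""
    # Imported here so extract_cells works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless")  # do not open a browser window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()  # always shut the browser down
```

Calling `extract_cells(fetch_rendered_html(url))` would then give the cell texts of the rendered page. If the value of a specific page function is needed, `driver.execute_script("return JS_Function(1, 2, 3);")` can be called before `driver.quit()`; content that loads late may also need an explicit wait first.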


2 Comments

I tried Selenium. I do not want to mimic user actions. From running a sample program, I see that it opens a browser window and mimics the actions. I do not want that. I want to extract the data from the webpage into my code.
You don't have to mimic user actions if you don't need to. Just download the page and parse it. The point of using Selenium is that it processes the JavaScript for you.
4

I use WebKit, which is the browser renderer behind Chrome and Safari. There are Python bindings to WebKit through Qt.

And here is a full Python example to execute JavaScript and extract the final HTML.


3

You can use the QtWebKit module of the PyQt4 library.
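A rough sketch of how this might look, assuming PyQt4 with QtWebKit is installed and Python 3 is used: the page is loaded in a `QWebPage`, the Qt event loop runs until `loadFinished` fires, and the resulting HTML is read from the main frame. The function name `render_html` is hypothetical, and scripts that keep fetching data after the load event would need extra waiting that is not shown here.

```python
import sys


def render_html(url):
    """Load url in QtWebKit, let its scripts run, and return the final HTML."""
    # Imported lazily so this module still loads where PyQt4 is missing.
    from PyQt4.QtCore import QUrl
    from PyQt4.QtGui import QApplication
    from PyQt4.QtWebKit import QWebPage

    app = QApplication(sys.argv)
    page = QWebPage()
    # Leave the event loop once the page reports that loading has finished.
    page.loadFinished.connect(app.quit)
    page.mainFrame().load(QUrl(url))
    app.exec_()  # blocks until app.quit is called
    return str(page.mainFrame().toHtml())
```

The returned string can be passed to any HTML parser. Only one `QApplication` may exist per process, so this version is suited to rendering a single page per run.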
