How can I wait until all of the JavaScript has loaded before fetching a website with cURL? I am trying to download the HTML from one of my pages, but the page fetches information asynchronously, so cURL retrieves a half-loaded page. Is there a way to get cURL to fetch the fully loaded page?
- cURL can't process JavaScript. (Shubham, Jul 3, 2012 at 17:02)
- cURL does not execute JavaScript. It will load the initial document served by the web server and nothing else. Any JavaScript that is executed to modify the DOM will have no effect on what you are able to load with cURL. (DaveRandom, Jul 3, 2012 at 17:03)
1 Answer
You need a headless browser engine to do this. cURL and wget are HTTP clients: they speak HTTP and download documents as text. They have no DOM and no JavaScript engine, so they cannot tell that a page goes on to load more content via AJAX. To download the final HTML, you need something that acts more like a browser, parsing the page into a DOM and executing its JavaScript. I recommend Crowbar, which uses a Mozilla engine.
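As a rough illustration of the headless-browser approach, here is a minimal Python sketch. It does not use Crowbar; it swaps in headless Chrome/Chromium, driven through its `--headless` and `--dump-dom` command-line flags. The function names, the list of executable names, and the 5-second `--virtual-time-budget` value are assumptions made for this example, and whether that budget is long enough depends on how the page loads its data.

```python
import shutil
import subprocess

# Executable names to look for; which one exists depends on the system (assumption).
CHROME_CANDIDATES = ("google-chrome", "chromium", "chromium-browser", "chrome")


def build_dump_dom_command(browser, url, budget_ms=5000):
    """Return the argument list for a headless Chrome/Chromium DOM dump."""
    return [
        browser,
        "--headless",                          # run without a visible window
        "--disable-gpu",                       # commonly added for headless runs
        f"--virtual-time-budget={budget_ms}",  # give scripts time to run (illustrative value)
        "--dump-dom",                          # print the serialized DOM to stdout
        url,
    ]


def fetch_rendered_html(url, budget_ms=5000, timeout_s=60):
    """Load url in a headless browser and return the serialized DOM as text."""
    # Pick the first candidate executable that is present on PATH.
    browser = next((p for p in map(shutil.which, CHROME_CANDIDATES) if p), None)
    if browser is None:
        raise RuntimeError("no Chrome/Chromium executable found on PATH")
    result = subprocess.run(
        build_dump_dom_command(browser, url, budget_ms),
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=True,  # raise CalledProcessError if the browser exits with an error
    )
    return result.stdout


# Example usage (requires Chrome or Chromium to be installed):
#   html = fetch_rendered_html("https://example.com/")
```

Unlike a plain cURL request, the output is the DOM after the page's scripts have run, so content inserted by AJAX calls that finish within the time budget should be included.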