I am trying to use PhantomJS to get the HTML generated by a dynamic page. I assumed this would be easy, but after a few hours of trying I have still had no luck.
The page has the following source code, which is also what eventually gets saved in 1.html:
<!doctype html>
<html lang="cs" ng-app="appId">
<head ng-controller="MainCtrl">
(omitted some lines)
<script src="/js/conf/config.js?pars"></script>
<script src="/js/all.js?pars"></script>
</head>
<body>
<!--<![endif]-->
<div site-loader></div>
<div page-layout>
<div ng-view></div>
</div>
</body>
</html>
All of the site's content gets loaded inside the site-loader div, but I cannot get it, even though I use a timeout before PhantomJS scrapes the HTML. Here is the code I am using:
var url = 'http:...';
var page = require('webpage').create();
var fs = require('fs');
page.open(url, function (status) {
    if (status !== 'success') {
        console.log('Fail');
        phantom.exit();
    } else {
        window.setTimeout(function () {
            fs.write('1.html', page.content, 'w');
            phantom.exit();
        }, 2000); // Change timeout as required to allow sufficient time
    }
});
What am I doing wrong?
EDIT: I decided to try the PJscrapper framework and configured it to scrape all the contents of the div block. All I got was this:
["","\n\t\tif (window.DOT) {\n\t\t\tDOT.cfg({service: 'sreality', impress: false});\n\t\t}\n\t","","Loader.load()","",""]
It seems I really do not get it: I always get the code as it is before Loader.load() runs, and the timeout obviously does not solve it.
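For comparison with the fixed 2000 ms timeout, here is a minimal sketch of polling the page until something has actually been rendered. It is only an illustration under assumptions: waitFor is a hypothetical helper (not part of PhantomJS), the "ng-view has child elements" check is a guess at what "rendered" means for this page, and the 10000 ms / 250 ms values are arbitrary. It will not help if the page fails to load or its scripts throw errors.

```javascript
// Hypothetical helper: call `check` every `intervalMs` until it returns true
// (then call `onReady`) or until `timeoutMs` has elapsed (then call `onTimeout`).
function waitFor(check, onReady, onTimeout, timeoutMs, intervalMs) {
    var start = Date.now();
    var timer = setInterval(function () {
        if (check()) {
            clearInterval(timer);
            onReady();
        } else if (Date.now() - start >= timeoutMs) {
            clearInterval(timer);
            onTimeout();
        }
    }, intervalMs);
}

// PhantomJS-only part; skipped when the file is loaded in another runtime.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    var fs = require('fs');
    var url = 'http:...'; // placeholder, as in the question

    page.open(url, function (status) {
        if (status !== 'success') {
            console.log('Fail: ' + status);
            phantom.exit(1);
            return;
        }
        waitFor(
            function () {
                // Assumption: the view counts as rendered once ng-view has children.
                return page.evaluate(function () {
                    var view = document.querySelector('[ng-view]');
                    return !!view && view.children.length > 0;
                });
            },
            function () {
                fs.write('1.html', page.content, 'w');
                phantom.exit(0);
            },
            function () {
                console.log('Timed out waiting for rendered content');
                phantom.exit(1);
            },
            10000, // overall limit in ms (arbitrary)
            250    // polling interval in ms (arbitrary)
        );
    });
}
```

The polling condition is the part most likely to need adjusting; any selector that only exists after the application has rendered would serve the same purpose.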
1.html. Please register to the onConsoleMessage and onError events. Maybe there are errors. If bind is an issue, you need a shim.
page.render so you can see a screenshot. Is it dull, or does it look like the page you see in your own browser?
page.open(url, function (status) resulted in a status different than success. When I took a screenshot, it was dull...
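The diagnostics suggested in the comments could be wired up roughly as follows. This is a sketch, not a confirmed fix: attachDiagnostics is a hypothetical helper name, the onResourceError handler is an addition the comments did not mention, and 1.png is an arbitrary output file name. onConsoleMessage, onError, onResourceError and render are PhantomJS page callbacks and methods.

```javascript
// Hypothetical helper: attach logging handlers to a PhantomJS page object.
// `log` is a caller-supplied function that receives one line of text.
function attachDiagnostics(page, log) {
    // Messages the page itself writes with console.log().
    page.onConsoleMessage = function (msg) {
        log('CONSOLE: ' + msg);
    };
    // Uncaught JavaScript errors raised inside the page.
    page.onError = function (msg, trace) {
        log('PAGE ERROR: ' + msg);
        (trace || []).forEach(function (frame) {
            log('    at ' + frame.file + ':' + frame.line);
        });
    };
    // Resources (scripts, XHR, ...) that failed to load.
    page.onResourceError = function (err) {
        log('RESOURCE ERROR: ' + err.url + ' (' + err.errorString + ')');
    };
    return page;
}

// PhantomJS-only part; skipped when the file is loaded in another runtime.
if (typeof phantom !== 'undefined') {
    var page = attachDiagnostics(require('webpage').create(), function (line) {
        console.log(line);
    });
    var url = 'http:...'; // placeholder, as in the question

    page.open(url, function (status) {
        console.log('Load status: ' + status);
        window.setTimeout(function () {
            // Screenshot of whatever PhantomJS actually rendered.
            page.render('1.png');
            phantom.exit(status === 'success' ? 0 : 1);
        }, 2000);
    });
}
```

If the log shows an error about bind being undefined, that would point to the missing Function.prototype.bind mentioned in the comments, in which case a shim has to be loaded before the page's own scripts run.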