I'm trying to use Selenium (in Python) to scrape a webpage that is built almost entirely with JavaScript.
For instance, this is the body of the page:
<body class="bodyLoading">
<!-- this is required for GWT history support -->
<iframe id="__gwt_historyFrame" role="presentation" width="0" height="0" tabindex="-1" title="empty" style="position:absolute;width:0;height:0;border:0" src="javascript:''"> </iframe>
<!-- For printing window contents -->
<iframe id="__printingFrame" role="presentation" width="0" height="0" tabindex="-1" title="empty" style="width:0;height:0;border:0;" />
<!-- TODO : RECOMMENDED if your web app will not function without JavaScript enabled -->
<noscript>
<div style="width: 22em; position: absolute; left: 50%; margin-left: -11em; color: red; background-color: white; border: 1px solid red; padding: 4px; font-family: sans-serif">
Your web browser must have JavaScript enabled in order for
Regulations.gov to display correctly.
</div>
</noscript>
</body>
For some reason, Selenium (using the Firefox engine) does not seem to evaluate the JavaScript on this page. If I use the get_html_source function, it returns only the HTML above, not the JavaScript-generated HTML that I can see in my browser (and in the Selenium browser). Unfortunately, the src attribute of the first iframe just says javascript:'', and I can't figure out what that means.
Any thoughts on how to make sure Selenium processes this iframe?
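To make the goal concrete, here is a minimal sketch of the kind of thing I'm after, not a confirmed fix. It assumes the WebDriver-style Python bindings (a driver object with an execute_script method) rather than get_html_source, and it guesses that the page drops the "bodyLoading" class from <body> once its scripts have finished. The function name rendered_html and its timeout and poll parameters are made up for illustration.

```python
import time


def rendered_html(driver, timeout=30.0, poll=0.5):
    """Return the live DOM as HTML once <body> is no longer marked as loading.

    `driver` is assumed to be a Selenium WebDriver instance; only its
    execute_script method is used. The "bodyLoading" check is a guess
    based on the markup quoted above, not documented behaviour of the site.
    """
    deadline = time.monotonic() + timeout
    while True:
        # Ask the browser for the current class list of <body>.
        body_class = driver.execute_script(
            "return document.body ? document.body.className : 'bodyLoading';"
        ) or ""
        if "bodyLoading" not in body_class.split():
            # Serialize the DOM as it exists now, after scripts have run,
            # instead of the HTML that was originally downloaded.
            return driver.execute_script(
                "return document.documentElement.outerHTML;"
            )
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "page still marked as loading after %.1f s" % timeout
            )
        time.sleep(poll)
```

With a real browser, this would be called after driver.get(...) on a driver created with selenium.webdriver.Firefox(), passing that driver as the only required argument.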