I'm using PhantomJS as a crawler; if there is no JS in a page I can assume that it's completely loaded when onLoadFinished fires, but if there is JS in a page, I need to wait a bit to give the scripts a chance to do stuff. This is my current stab at detecting JS:

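var pageHasJS = page.evaluate(function () {
    // True if the page has any <script> element, or any attribute whose
    // name starts with "on" (onload, onclick, ...).
    return (document.getElementsByTagName("script").length > 0 ||
            document.evaluate("count(//@*[starts-with(name(), 'on')])",
                              document, null, XPathResult.NUMBER_TYPE,
                              null).numberValue > 0);
});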

This looks for <script> tags and for elements with an onsomething attribute.

Q1: Is there any other HTML construct that can sneak JS into a page? javascript: URLs do not count, because nothing will ever get clicked.
Q2: Is there a better way to do the second test? I believe it is not possible to do that with querySelector, hence resorting to XPath, but maybe there is some other feature that would accomplish the same task.
Q3: The crawler does not interact with the page once it is loaded. The onload event is the only legacy event attribute that I know of that fires in the absence of user interaction. Are there any others? In other words, would it be safe to replace the second test with document.evaluate("count(//@onload)", ...) or maybe even !!document.body.getAttribute("onload")?
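
For comparison with the XPath test in Q2, here is a minimal sketch of the same check written as a plain DOM walk. The helper name hasEventHandlerAttribute is hypothetical, and the function takes any array-like list of elements so it can be tried outside a browser. Like the XPath version, it matches any attribute whose name begins with "on", not only real event handlers.

```javascript
// Hypothetical helper: returns true if any element in `elements` carries
// an attribute whose name starts with "on" (onload, onclick, ...).
function hasEventHandlerAttribute(elements) {
    for (var i = 0; i < elements.length; i++) {
        var attrs = elements[i].attributes;
        for (var j = 0; j < attrs.length; j++) {
            if (attrs[j].name.toLowerCase().indexOf("on") === 0) {
                return true;
            }
        }
    }
    return false;
}

// Intended use inside page.evaluate (not run here), where `document` exists:
//   hasEventHandlerAttribute(document.getElementsByTagName("*"))
```

Because the function passed to page.evaluate runs in the page's own context, the helper would have to be defined inside that function rather than in the outer PhantomJS script.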

  • I think you're good. There may be JS in an onunload attribute, but that should not concern you. Commented May 22, 2014 at 0:41

1 Answer

Instead of checking for script tags and waiting a fixed amount of time, you can intercept the actual HTTP requests (take a look at onResourceRequested / onResourceReceived) and take the screenshot after all resources have been loaded. Take a look at ajax-render.
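
A minimal sketch of that idea, assuming "all resources loaded" means no request has been in flight for a short quiet period. The helper createResourceTracker, its quietMs parameter, and the 500 ms value in the wiring comment are illustrative assumptions, not part of PhantomJS or ajax-render. It relies only on the id field passed to onResourceRequested and the id and stage fields passed to onResourceReceived.

```javascript
// Hypothetical helper: counts in-flight requests and calls onIdle once
// none have been pending for quietMs milliseconds.
function createResourceTracker(onIdle, quietMs) {
    var pending = {};   // request id -> true while in flight
    var count = 0;
    var timer = null;

    function check() {
        // Any network activity resets the quiet-period timer.
        clearTimeout(timer);
        timer = null;
        if (count === 0) {
            timer = setTimeout(onIdle, quietMs);
        }
    }

    return {
        requested: function (requestData) {
            if (!pending[requestData.id]) {
                pending[requestData.id] = true;
                count++;
            }
            check();
        },
        received: function (response) {
            // A response is reported in stages; only "end" completes it.
            if (response.stage === "end" && pending[response.id]) {
                delete pending[response.id];
                count--;
            }
            check();
        },
        pendingCount: function () { return count; }
    };
}

// Wiring in a PhantomJS script (not run here):
//   var tracker = createResourceTracker(function () {
//       page.render("out.png");
//       phantom.exit();
//   }, 500);
//   page.onResourceRequested = tracker.requested;
//   page.onResourceReceived = tracker.received;
```

A fuller version would probably also need to account for requests that fail or time out (PhantomJS exposes onResourceError and onResourceTimeout for this), so that a lost request does not keep the count above zero.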

1 Comment

That's unfortunately not good enough: consider the surprisingly common case of a page that uses JavaScript to change window.location after a delay of seconds to minutes.
