So I have a piece of HTML that looks something like this:

<html>
  <body>
    <script>
      var foo = {
        bar: []
      };
    </script>
  </body>
</html>

And I am trying to use PhantomJS to extract the value of foo.bar. How would I do this? So far, I know the script would be structured like this:

var webPage = require('webpage'); 
var page = webPage.create();
page.open(MY_URL, function(status) {
  var foo = page.evaluate(function(){
    //gets javascript from the HTML in the response
    // and extracts foo from there
  });
  // Log and exit inside the callback, once the page has loaded
  console.log(foo);
  phantom.exit();
});

2 Answers

1

It seems you should just be able to use this inside the page.open callback:

var foo = page.evaluate(function() {
  return window.foo;
});
console.log('foo =', foo);

4 Comments

I should be able to. However, there is one problem. Although I can run window.foo, I can't access the foo.bar array even though bar comes up when I type console.log(Object.keys(foo));.
@TheLegendOfCode In that case, it seems your example above is a poor one.
I changed return window.foo to return JSON.stringify(window.foo) and it says that foo.bar is completely empty. However, when I opened the URL in the browser and printed foo.bar in the console, it was full.
@TheLegendOfCode Obviously something is populating the foo.bar array after the page initially loads. Like I said, your example does not represent reality.
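
If foo.bar really is filled in after the initial load, one option is to poll the page until the array has entries instead of reading it once. The following is only a sketch under assumptions: readFooBar and pollFooBar are hypothetical helper names, the URL is a placeholder, and the budget of 20 retries at 250 ms is arbitrary. The typeof phantom guard just keeps the wiring from running outside PhantomJS.

```javascript
// Evaluated inside the page: returns foo.bar once it has entries, else null.
function readFooBar() {
  var foo = window.foo;
  return (foo && foo.bar && foo.bar.length > 0) ? foo.bar : null;
}

// Hypothetical helper: re-evaluates readFooBar every intervalMs until it
// yields data or the retries run out, then calls done(err, bar).
function pollFooBar(page, retriesLeft, intervalMs, done) {
  var bar = page.evaluate(readFooBar);
  if (bar) {
    done(null, bar);
  } else if (retriesLeft <= 0) {
    done(new Error('foo.bar is still empty'));
  } else {
    setTimeout(function () {
      pollFooBar(page, retriesLeft - 1, intervalMs, done);
    }, intervalMs);
  }
}

// Wiring: only runs under PhantomJS, where the global `phantom` exists.
if (typeof phantom !== 'undefined') {
  var MY_URL = 'http://example.com/'; // placeholder, replace with the real URL
  var page = require('webpage').create();
  page.open(MY_URL, function (status) {
    if (status !== 'success') {
      console.log('Failed to load ' + MY_URL);
      phantom.exit(1);
      return;
    }
    // Assumed budget: 20 retries, 250 ms apart (about 5 seconds in total)
    pollFooBar(page, 20, 250, function (err, bar) {
      console.log(err ? err.message : 'foo.bar = ' + JSON.stringify(bar));
      phantom.exit(err ? 1 : 0);
    });
  });
}
```

Since page.evaluate hands back a serialized copy of the value, each poll takes a fresh snapshot, and the script exits only after the callback has fired.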
0

Here's how you would open a URL from Node.js with the phantom npm package and get a value returned by JavaScript evaluated in the page:

const phantom = require('phantom');

// Express-style handler; it must be async because it uses await
const openUrl = async (req, res, next) => {
    const url = 'your url goes here';
    const {content, returnedFromPage} = await loadJsSite(url);
    return res.json(returnedFromPage);
};

// An async function already returns a promise, so no explicit Promise wrapper is needed
const loadJsSite = async (url) => {
    const instance = await phantom.create();
    const page = await instance.createPage();
    await page.on('onResourceRequested', function(requestData) {
      console.info('Requesting', requestData.url);
    });

    const status = await page.open(url);
    // Runs inside the page; the return value is passed back to Node
    const returnedFromPage = await page.evaluate(function() {
        return document.title;
    });
    const content = await page.property('content');

    await instance.exit();

    return {content: content, returnedFromPage: returnedFromPage};
};
