I'm trying to index a food recipes page, where the actual recipe is stored as an object inside a script embedded in the page.

One example URL: http://www.dagbladet.no/mat/oppskrift/bakt-potet-med-romme-og-blamuggostdressing

If I open the developer tools in the browser and type:

console.dir(food.recipeItem.title)

I get the title back:

"Bakt potet med rømme- og blåmuggostdressing"

That is exactly what I need. But how can I get hold of that script and parse it within a Node.js application? Cheerio might help me find the script, but can it do more than that? I'm not sure how to do this, nor which approach is the most computationally efficient or the most robust.

1 Answer

It's pretty easy: all you have to do is parse the returned HTML. If you inspect the returned HTML (view-source:http://www.dagbladet.no/mat/oppskrift/bakt-potet-med-romme-og-blamuggostdressing), you will find a script tag that contains all the information you need in several JavaScript variables. These variables hold JSON data. Since the script is hardcoded directly into the HTML document, and not obtained by XHR or similar, parsing the HTML is the only way of doing this.

So basically you have these 3 steps:

1. Send an HTTP GET request to the link above.

2. Parse the HTML string to extract the script tag, using some library (check this link to decide which library to use).

3. Parse the JavaScript string (the script extracted in step 2) to extract the JSON data. Check the UglifyJS library for Node.js.
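Steps 2 and 3 could be sketched roughly as below. This is only an illustration under several assumptions: it swaps UglifyJS for Node's built-in `vm` module and evaluates the script instead of walking its syntax tree, it finds script tags with a simple regex rather than an HTML library, and it assumes the variable is declared at the top level with `var` (or assigned as a global). The helper names `extractInlineScripts` and `readGlobal` are made up for this example, and the sample markup is invented, not copied from the real page. Step 1 (fetching the HTML) is left out; any HTTP client will do.

```javascript
const vm = require('vm');

// Hypothetical helper: return the source of every non-empty inline <script>.
// A regex is good enough for a sketch; an HTML parser such as cheerio
// would be more robust on real-world markup.
function extractInlineScripts(html) {
  const scripts = [];
  const re = /<script\b[^>]*>([\s\S]*?)<\/script>/gi;
  let match;
  while ((match = re.exec(html)) !== null) {
    if (match[1].trim() !== '') scripts.push(match[1]);
  }
  return scripts;
}

// Hypothetical helper: run each inline script in its own vm context and
// return the first global with the given name, copied into plain JSON data.
// Assumes the variable is a top-level `var` or an implicit global;
// `let` and `const` declarations do not show up on the context object.
function readGlobal(html, name) {
  for (const source of extractInlineScripts(html)) {
    const sandbox = {};
    vm.createContext(sandbox);
    try {
      vm.runInContext(source, sandbox, { timeout: 1000 });
    } catch (err) {
      // Scripts that rely on `window` or `document` throw here.
      // Anything assigned before the error is still on the sandbox.
    }
    if (sandbox[name] !== undefined) {
      // Round-trip through JSON to get an ordinary object in this realm.
      return JSON.parse(JSON.stringify(sandbox[name]));
    }
  }
  return undefined;
}

// Invented sample markup standing in for the downloaded page.
const sampleHtml = `
<html><body>
<script src="/app.js"></script>
<script>var food = { recipeItem: { title: "Bakt potet med rømme- og blåmuggostdressing" } };</script>
</body></html>`;

const food = readGlobal(sampleHtml, 'food');
console.log(food.recipeItem.title);
```

Note that `vm` is not a security boundary, so this should only be run against pages you trust. If the script on the real page depends on browser globals, a syntax-tree approach (as with UglifyJS) or a stubbed `window` object in the sandbox may be needed instead.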

3 Comments

Thanks @Borna ! Step 2 is the part I'm struggling with. I'll handle finding the script tag with cheerio, but do I then need to parse the content of the script? And how to do that?
Hi, I made some changes, hope it helps
Thanks, I'll try that!
