I am trying my hand at Node.js. I want to build a simple crawler that scans a page and writes all of its links to a JSON file. However, when I run the script, it returns 0 links.

Here is my code in its entirety:

    var request = require('request');
    var cheerio = require('cheerio');
    var fs = require("fs");

    var url = 'https://stackoverflow.com/questions';

    //Create the blank array to fill:
    var obj = {
       table: []
    };

    var i = 0;

    request(url, function(err, resp, body){
      $ = cheerio.load(body);
      links = $('a'); //jquery get all hyperlinks

      $(links).each(function(i, link){
        var actualLink = $(link).attr('href');
          obj.table.push({id: i, url:actualLink}); //add some data
          i++;
      });

    }); 

    var json = JSON.stringify(obj);

    console.log(json);

The output in the terminal is:

    $ !!
    node nodetest.js
    {"table":[]}

Can anyone see why this is blank? Bonus points for writing the final JSON to a file :)

1 Answer

You must use obj inside the callback passed to request, because that is where it gets populated:

    request(url, function(err, resp, body) {
        if (err) {
            // Without a response body there is nothing to parse
            console.error(err);
            return;
        }

        var $ = cheerio.load(body);
        var links = $('a'); // select all hyperlinks, jQuery-style

        $(links).each(function(i, link) {
            var actualLink = $(link).attr('href');
            obj.table.push({id: i, url: actualLink}); // record each link with its index
        });

        // Only here can you be sure that the "obj" variable is properly
        // populated, because this callback runs once the HTTP request completes
        var json = JSON.stringify(obj);
        console.log(json);
    });

In your code, console.log is placed outside the request callback. Because the request is asynchronous, that line runs before the callback has fired, so the obj variable is not yet populated.
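To make the ordering concrete, here is a minimal sketch that runs without the network. Note the assumptions: fakeRequest is a hypothetical stand-in for request that calls back after a short delay, and the regular expression is only a crude substitute for cheerio. Neither is part of the original code.

```javascript
// Hypothetical stand-in for request(): calls back asynchronously
// with a small fake HTML body instead of doing a real HTTP request.
function fakeRequest(url, callback) {
  setTimeout(function () {
    callback(null, { statusCode: 200 }, '<a href="/a"></a><a href="/b"></a>');
  }, 10);
}

var obj = { table: [] };
var order = []; // records which piece of code ran first

fakeRequest('https://example.com', function (err, resp, body) {
  // Runs later, once the fake "response" has arrived.
  // Crude href extraction, for illustration only (use cheerio for real HTML).
  var hrefs = body.match(/href="[^"]*"/g) || [];
  hrefs.forEach(function (attr, i) {
    obj.table.push({ id: i, url: attr.slice(6, -1) });
  });
  order.push('callback');
  console.log('inside callback:', JSON.stringify(obj));
});

// Runs first: the callback above has not fired yet, so obj is still empty.
order.push('after call');
var tooEarly = JSON.stringify(obj);
console.log('after call:', tooEarly);
```

The "after call" line is printed first, with an empty table. The populated object only shows up in the "inside callback" line.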

Also notice that you don't need the outer i variable. The index is passed to the each callback automatically, so you don't need to declare or increment it yourself.

To write the result to a file, you could use the fs.writeFile function. Call it inside the request callback, where json is defined:

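    fs.writeFile("/tmp/test", json, function(err) {
        if (err) {
            // Report failures instead of silently ignoring them
            console.error("Could not save file:", err);
            return;
        }
        console.log("File successfully saved");
    });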
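As a self-contained sketch of the file-writing step, the snippet below saves a small sample object and prints the absolute path, so it is clear where the file ended up. The sample links, the file name links-example.json, and the use of os.tmpdir() are assumptions for illustration. In the real script, obj would be the object filled in by the request callback.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Sample data standing in for the links collected by the crawler
var obj = { table: [{ id: 0, url: '/questions' }, { id: 1, url: '/tags' }] };

// Pretty-print with 2-space indentation so the file is easy to read
var json = JSON.stringify(obj, null, 2);

// Build an absolute path inside the platform's temp directory
var outFile = path.join(os.tmpdir(), 'links-example.json');

fs.writeFile(outFile, json, function (err) {
  if (err) {
    console.error('Could not save file:', err);
    return;
  }
  console.log('Saved ' + obj.table.length + ' links to ' + outFile);
});
```

A relative name such as "test.json" would instead be resolved against the directory the script is run from.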

1 Comment

This reported success but then didn't do anything. I changed "/tmp/test" to "test.json" and it worked.
