I have the following routine, which works well. The only problem is that I need the results to come back in the same order as the links array: the result for the google.com link first, then yahoo, and so on. The code currently logs results in whatever order the pages happen to finish loading.

var Nightmare = require('nightmare');
var async = require('async');
var links = [
    "http://www.google.com",
    "http://www.yahoo.com",
    "http://www.bing.com",
    "http://www.aol.com",
    "http://duckduckgo.com",
    "http://www.ask.com"
  ];

var scrape = function(url, callback) {
  var nightmare = new Nightmare();
  nightmare.goto(url);
  nightmare.wait('body');
  nightmare.evaluate(function () {
    return document.querySelector('body').innerText;
  })
  .then(function (result) {
    console.log(url, result);
  })
  nightmare.end(function() {
    callback();
  });
}

async.map(links, scrape);

UPDATE: Thanks @christophetd. Here is my revised working code:

var Nightmare = require('nightmare');
var async = require('async');
var links = [
    "http://www.google.com",
    "http://www.yahoo.com",
    "http://www.bing.com",
    "http://www.aol.com",
    "http://duckduckgo.com",
    "http://www.ask.com"
  ];

var scrape = function(url, callback) {
  var nightmare = new Nightmare();
  nightmare.goto(url);
  nightmare.wait('body');
  nightmare.evaluate(function () {
    return document.querySelector('body').innerText;
  })
  .then(function (result) {
    callback(null, url+result);
  });
  nightmare.end();
}

async.map(links, scrape, function (err, results) {
  if (err) return console.log(err);
  console.log(results);
})
  • what you describe is not async... can you re-order them once you have them all? Commented May 24, 2016 at 20:58

1 Answer

From the official async documentation:

the results array will be in the same order as the original collection

This is easy to verify:

var async = require('async');

// This function waits for 'number' seconds, then calls cb(null, number)
var f = function (number, cb) {
    setTimeout(function () {
        cb(null, number)
    }, number * 1000)
}

async.map([4, 3, 2, 1], f, function (err, results) {
    console.log(results); // [4, 3, 2, 1]
})

As the code above shows, even though f takes longer to process the element 4 than the element 3, 4 still comes first in the results.
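
One way to see why completion order does not matter is to write each result into the slot that matches its input index. The sketch below is a minimal illustration of that idea, not the async library's actual source; orderedMap and delayed are hypothetical names introduced here, and the delays are in tens of milliseconds rather than seconds so the demo finishes quickly.

```javascript
// Hypothetical helper (not the async library's implementation): it only
// illustrates why completion order does not affect result order.
function orderedMap(items, worker, done) {
  var results = new Array(items.length);
  var pending = items.length;
  var failed = false;

  if (pending === 0) return done(null, results);

  items.forEach(function (item, index) {
    worker(item, function (err, value) {
      if (failed) return;        // ignore callbacks that arrive after an error
      if (err) {
        failed = true;
        return done(err);
      }
      results[index] = value;    // the slot is fixed by the input position
      pending -= 1;
      if (pending === 0) done(null, results);
    });
  });
}

// Larger numbers finish later, yet keep their original position.
function delayed(number, cb) {
  setTimeout(function () { cb(null, number); }, number * 10);
}

orderedMap([4, 3, 2, 1], delayed, function (err, results) {
  console.log(results); // [4, 3, 2, 1]
});
```

All the workers start right away and run concurrently; only the position at which each result is stored is determined by the input array.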


In your case, pass a final callback to async.map and have scrape hand its result to the callback it receives:

async.map(links, scrape, function (err, results) {
    if (err) {
        // handle error, don't forget to return
    }
    // results will be in the same order as 'links'
})

This should give you the expected result.

1 Comment

Thanks, this was perfect. I had seen the answer you referenced in the docs, but for some reason it didn't make sense until I saw your answer.