
I've done hours of research on asynchronous programming, but I just can't seem to grasp this one single concept within Node, so I was wondering if someone here could help me.

I've written the following code sample to return / output a simple string which is a concatenation of strings from an object:

var itemCollection = {
    item1 : [{ foo : "bar" }, { foo : "bar" }, { foo : "bar" }],
    item2 : [{ foo : "bar" }, { foo : "bar" }, { foo : "bar" }],
    item3 : [{ foo : "bar" }, { foo : "bar" }, { foo : "bar" }]
}

var aString = "";

for (var item in itemCollection) {
    for (var i = 0; i < itemCollection[item].length; i++) {
        var anItem = itemCollection[item][i];

        // someFunctionThatDoesALongIOOperation takes an item as a param, plus a callback.
        someFunctionThatDoesALongIOOperation(anItem, function(dataBackFromThisFunction){
            // Do something with the data returned from this function
            aString += dataBackFromThisFunction.dataToAppend;
        });
    }
}

console.log(aString);

So from what I understand, languages other than JavaScript would run someFunctionThatDoesALongIOOperation synchronously and the script would run in a 'blocking mode'. This would mean that aString would be returned / output with its correct value.

However, since Node runs I/O asynchronously, the rest of the code keeps running while an operation is in flight, and operations may not complete in the order they were started. This is because of the way the event loop works in Node. I think I get this.
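To make the ordering described above concrete, here is a minimal sketch. fakeIOOperation is a hypothetical stand-in for someFunctionThatDoesALongIOOperation (it is not part of the original code) and simply calls back on a later turn of the event loop via setTimeout; the events array is only there to record what ran when.

```javascript
// Hypothetical stand-in for someFunctionThatDoesALongIOOperation:
// it calls back on a later turn of the event loop.
function fakeIOOperation(item, callback) {
    setTimeout(function () {
        callback({ dataToAppend: item.foo });
    }, 10);
}

var events = [];
var aString = "";

fakeIOOperation({ foo: "bar" }, function (data) {
    aString += data.dataToAppend;
    events.push("callback ran, aString = '" + aString + "'");
    console.log(events[events.length - 1]);
});

// This runs first: the callback above has only been scheduled so far.
events.push("after the call, aString = '" + aString + "'");
console.log(events[events.length - 1]);
```

Under these assumptions the "after the call" line is logged with an empty string, and the callback line follows once the timer fires, which is the same reason console.log(aString) in the question prints nothing.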

So this is where my question comes in. If I wanted aString to be returned / output with its correct value, as it would be in other languages, what would I need to do to the loops in my code example? Or, in more technical words: what is the correct approach for making aString hold the expected result, so that the IO operations (which take longer to run) have all completed before aString is returned?

I hope my question makes sense, if it doesn't, please let me know and I will make edits where appropriate.

Thank you

  • Are you looking to make the code async? Commented Jul 24, 2015 at 0:01
  • "languages other than Javascript would run someFunctionThatDoesALongIOOperation synchronously": not necessarily, it depends on how the function is coded. Commented Jul 24, 2015 at 0:08
  • Node.js is not just somehow async by default; its API is, and you are encouraged to write your code to be async. Commented Jul 24, 2015 at 0:24

1 Answer

Since the function you apply to each item is asynchronous, the loop that processes them also must be asynchronous (likewise the function that consumes the result of this loop must also be async). Check out Bob Nystrom's "What Color is Your Function?" for more insight on this particular point.
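As a small illustration of that point, the sketch below shows how asynchrony propagates to callers. Both fakeIOOperation and describeItem are hypothetical names introduced only for this example: because describeItem depends on an async function, it cannot return its value and has to accept a callback itself.

```javascript
// Hypothetical stand-in for someFunctionThatDoesALongIOOperation.
function fakeIOOperation(item, callback) {
    setTimeout(function () {
        callback({ dataToAppend: item.foo });
    }, 10);
}

// describeItem calls an async function, so it cannot `return` the text;
// it has to take a callback as well and pass the result along.
function describeItem(item, callback) {
    fakeIOOperation(item, function (data) {
        callback("item says: " + data.dataToAppend);
    });
}

var described = null;
describeItem({ foo: "bar" }, function (text) {
    described = text; // only set once the I/O has finished
    console.log(text);
});
```

Anything that in turn needs the described text would have to live inside that final callback, or take a callback of its own.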

There are two ways to do this (both using caolan's async library to wrap all the nasty callback logic):

  • Do one async operation at a time, waiting for the previous one to finish before the next can begin. This is probably most similar to the way a traditional synchronous loop runs. We can do this with async.reduce:

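    // Flatten the object of arrays so each step receives a single item.
    var items = Object.keys(itemCollection).reduce(function(all, key) {
        return all.concat(itemCollection[key]);
    }, []);

    async.reduce(items, "", function(memo, item, callback) {
        someFunctionThatDoesALongIOOperation(item, function(dataBackFromThisFunction) {
            callback(null, memo + dataBackFromThisFunction.dataToAppend);
        });
    }, function(err, result) {
        if (err) throw err;
        var aString = result;
        console.log(aString); // the complete string is only available here
    });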
    
  • Of course, there's little point in having async code if we don't actually reap its benefits and execute many things at once. We can do all the async operations in parallel and reduce the results in a single step afterwards. I've found this works well when processing each item requires some long operation such as network I/O, since we can kick off and wait for many requests at once. We use async.map to achieve this:

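    // Flatten the object of arrays so each call receives a single item.
    var items = Object.keys(itemCollection).reduce(function(all, key) {
        return all.concat(itemCollection[key]);
    }, []);

    async.map(items, function(item, cb) {
        someFunctionThatDoesALongIOOperation(item, function(dataBackFromThisFunction) {
            cb(null, dataBackFromThisFunction.dataToAppend);
        });
    }, function(err, results) {
        if (err) throw err;
        var aString = results.join('');
        console.log(aString); // the complete string is only available here
    });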
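For readers who want to see roughly what the two approaches do, here is a hand-rolled sketch without the async library. It is not the library's implementation: fakeIOOperation, concatInSeries, concatInParallel and the delay field are all hypothetical names made up for this example, and the delays are chosen so that later items finish sooner.

```javascript
// Hypothetical stand-in for someFunctionThatDoesALongIOOperation.
// Later items get shorter delays, so they complete out of start order.
function fakeIOOperation(item, callback) {
    setTimeout(function () {
        callback({ dataToAppend: item.foo });
    }, item.delay);
}

var sampleItems = [
    { foo: "a", delay: 30 },
    { foo: "b", delay: 20 },
    { foo: "c", delay: 10 }
];

// Series: start the next operation only when the previous one calls back.
function concatInSeries(list, done) {
    var result = "";
    function next(index) {
        if (index === list.length) { return done(result); }
        fakeIOOperation(list[index], function (data) {
            result += data.dataToAppend;
            next(index + 1);
        });
    }
    next(0);
}

// Parallel: start everything at once, store each result at its own index,
// and join only after the last callback has arrived.
function concatInParallel(list, done) {
    var results = new Array(list.length);
    var pending = list.length;
    if (pending === 0) { return done(""); }
    list.forEach(function (item, index) {
        fakeIOOperation(item, function (data) {
            results[index] = data.dataToAppend;
            pending -= 1;
            if (pending === 0) { done(results.join("")); }
        });
    });
}

var output = {};
concatInSeries(sampleItems, function (s) {
    output.series = s;
    console.log("series:", s);
});
concatInParallel(sampleItems, function (s) {
    output.parallel = s;
    console.log("parallel:", s);
});
```

Both variants produce the items in their original order. The parallel one does so because each result is written to its own index rather than appended on arrival, which is why a shared `aString +=` inside parallel callbacks would not be safe for ordering.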
    

4 Comments

Why map and reduce instead of just each or parallel?
reduce is executed in series and is probably more semantic in this particular application. map is picked since the final callback is only called when everything is finished, and the results can be reduced separately afterwards. each executes everything in parallel, so the order of the results still isn't guaranteed. parallel can be equivalent to map, but requires a separate function for each element in the array (and still needs to be reduced manually afterwards).
Aye, sorry meant series not each. But yeah I see what you're suggesting.
series would work but, like parallel, is intended for control flow of many different async operations, rather than applying the same operation to many things.
