
This is in a Node.js/Express service using MongoDB.

I have a collection of objects that each have 17 fields. I am trying to produce the set of distinct values for one particular field, along with a count for each value. There is probably a better way of doing this, and I'm interested in hearing of it, but for now my specific problem is the odd behavior I see with my current approach.

    var collection = db.collection('docs');
    var counts = {};
    var docs = collection.find({title: {$ne: null}}, {title:1});
    while (docs.hasNext()) {
        var doc = docs.next();
        console.log('Processing ' + JSON.stringify(doc));
        counts[doc.title] = (counts[doc.title]||0) + 1;
    }

There should be about 6000 documents, but what I see is:

    Processing {}
    Processing {}
    ...

close to 20,000 times, after which there is a pause for a few seconds, and then finally a fatal error with out of memory.

If I run that same find in Robomongo, I get the results I expect, namely about 6000 documents with non-null titles.

Can anyone suggest what the problem is?

Note - I'm not looking for alternative working ways to achieve the same effect - I have those - what I'm looking for is an explanation of what is going wrong when doing things this way, because AFAICT it should work, and I'd like to close the gap in my understanding. For example, using toArray, things work:

    var results = collection.find({title: {$ne: null}}, {title: 1});
    results.toArray(function(e, docs) {
      console.log('Got ' + docs.length + ' results');
      for (var i = docs.length - 1; i >= 0; i--) {
        var doc = docs[i];
        counts[doc.title] = (counts[doc.title]||0) + 1;
      }
    });

I have also pasted and run this almost identical code with no problem in the Mongo shell:

    var collection = db.getCollection('docs');
    var counts = {};
    var docs = collection.find({title: {$ne: null}}, {title:1});
    while (docs.hasNext()) {
        var doc = docs.next();
        printjson(doc);
        counts[doc.title] = (counts[doc.title]||0) + 1;
    }
    printjson(counts);

That behaves as expected, and the only difference between that and the code running under node is it uses db.getCollection() versus db.collection(), and printjson() vs console.log().

So this seems to be some weird issue with running in nodejs specifically.

  • Did you try setting an index on your title field? Try this and see if it works: db.collection.ensureIndex({title: 1}) or db.collection.createIndex({title: 1}) Commented Dec 3, 2016 at 7:59
  • I've tried with and without it being an index; makes no difference. Commented Dec 3, 2016 at 8:43
  • Did you try copying and pasting your JS code and running it in the MongoDB shell? Commented Dec 3, 2016 at 12:53
  • I have now, and updated my question. It works as expected in the shell, so it seems to be specific to the fact that I'm running it in the Node.js environment. Commented Dec 3, 2016 at 21:05

2 Answers


You can do it like this, using a processing function:

    var counts = {};
    collection.find({title: {$ne: null}}, function(err, docs) {
        // docs is the cursor; fetch documents one at a time via callbacks
        docs.nextObject(processItem);

        function processItem(err, item) {
            if (!item) {
                console.log('cursor exhausted');
            } else {
                console.log('Processing ' + JSON.stringify(item));
                counts[item.title] = (counts[item.title]||0) + 1;
                // request the next document only after this one is handled
                docs.nextObject(processItem);
            }
        }
    });

2 Comments

That works (with a minor change of using find().nextObject()). But I still don't understand what is going wrong in my original code, which AFAICT should work too.
just a thought, aren't you stringifying err?

I think I understand what I am doing wrong. I had been following the MongoDB shell (JavaScript) documentation, when I should have been looking at the Node.js MongoDB driver documentation. hasNext() and next() behave very differently in the Node environment; for the version I am running, the behavior is described here
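
To make the difference concrete: in the Node.js driver, hasNext() and next() called without a callback return Promises rather than plain values. A Promise object is always truthy, so the while loop never ends, and JSON.stringify of a Promise yields {}, which matches the output in the question. The loop also never yields to the event loop, which would plausibly explain the eventual out-of-memory crash. Below is a minimal sketch of this; makeFakeCursor is a hypothetical stand-in for the driver's cursor (so the snippet runs without a database), and countTitles is an illustrative name, not part of the driver. It assumes a Node.js version with async/await support.

```javascript
// Hypothetical stand-in for a driver cursor (not the real MongoDB API):
// like the driver's cursor when no callback is passed, both methods
// return Promises instead of plain values.
function makeFakeCursor(items) {
  let i = 0;
  return {
    hasNext: () => Promise.resolve(i < items.length),
    next: () => Promise.resolve(i < items.length ? items[i++] : null),
  };
}

// Why the synchronous loop from the question misbehaves:
const demo = makeFakeCursor([{ title: 'a' }]);
console.log(Boolean(demo.hasNext()));      // true: a Promise is always truthy
console.log(JSON.stringify(demo.next()));  // {}: a Promise has no enumerable own properties

// Awaiting each Promise restores the loop the question intended.
async function countTitles(cursor) {
  const counts = {};
  while (await cursor.hasNext()) {
    const doc = await cursor.next();
    counts[doc.title] = (counts[doc.title] || 0) + 1;
  }
  return counts;
}

countTitles(makeFakeCursor([{ title: 'a' }, { title: 'b' }, { title: 'a' }]))
  .then((counts) => console.log(counts));
```

With a real cursor from collection.find(), the same awaited loop should apply, provided the installed driver version returns Promises from these methods; check the documentation for the driver version in use.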
