I need help creating a count query on nested objects in a field, across all documents. Each document has many fields. One particular field, "hotlinks", consists of many dynamically keyed inner objects.

Doc1:
{
  hotlinks : { 112222:{....} , 333333: {.....} , 545555: {.....}      }
}

Doc2:
{
  hotlinks : { 67756:{....} , 756767: {.....} , 1111111: {.....}      }
}

Each document has a hotlinks field. The hotlinks field consists of varied inner hotlink objects. Each key is a unique ID generated in Java and maps to an object that contains data (inner fields).

I need a way to get the count of all the inner nested objects of the "hotlinks" field. For example, the total number of inner objects of hotlinks in Doc1 and Doc2 would be 6.

Is there any way to get this count across all documents with a single query?

Thanks a lot, Karan

  • Without using mapreduce functionality, I'm pretty sure you'd have to simply iterate over the cursor and "manually" count the occurrences within the sub-document. Commented Aug 12, 2013 at 9:58
  • The most efficient would be to store the count when you save or modify the document. Commented Aug 12, 2013 at 10:38
  • Unfortunately, as of 2.4 there is no way to do this using the aggregation framework. If your hotlinks had been in an array, you could have used $unwind and $sum to calculate the totals, but in this case you'll have to use mapReduce or do it in the client. Commented Aug 13, 2013 at 11:45
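
The mapReduce route mentioned in the comments could look roughly like the sketch below. The function names mapHotlinks and reduceHotlinks and the emitted key "hotlinks" are made up for illustration; inside MongoDB, `this` is the current document and `emit` is supplied by the server. Note that mapReduce is deprecated in recent MongoDB releases, so treat this as an option for older servers only.

```javascript
// Map: runs once per document. `this` is the document; `emit` is provided by MongoDB.
// A missing hotlinks field is treated as an empty object (an assumption, not from the question).
function mapHotlinks() {
    emit("hotlinks", Object.keys(this.hotlinks || {}).length);
}

// Reduce: sums the per-document counts emitted under the same key.
function reduceHotlinks(key, values) {
    return values.reduce(function (a, b) { return a + b; }, 0);
}

// In the mongo shell (not run here):
// db.collection.mapReduce(mapHotlinks, reduceHotlinks, { out: { inline: 1 } });
```

Emitting everything under one constant key means the reduce step produces a single total for the whole collection.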

2 Answers

This is quite possible through the aggregation framework if you are using MongoDB 3.6 or newer. Use the $objectToArray operator within an aggregation pipeline to convert the hotlinks document to an array. The returned array contains one element for each field/value pair in the original document, and each element is a document with two fields, k and v.

Once you have the array, the $size operator returns the number of elements in it, which gives you the count per document.

Getting the count across all documents requires a $group stage in which you set the _id key to null or a constant value, so that the accumulated values are calculated over all the input documents as a whole.
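
To picture the shape of the k/v array, here is a plain JavaScript analogue. It is an illustration only, not MongoDB code: the helper name objectToArray and the empty placeholder values in doc1 are assumptions standing in for the real inner objects.

```javascript
// Plain-JavaScript analogue of $objectToArray (illustration only, not MongoDB code).
function objectToArray(doc) {
    return Object.entries(doc).map(function (pair) {
        return { k: pair[0], v: pair[1] };
    });
}

// Sample document modeled on Doc1 from the question; inner values are placeholders.
var doc1 = { hotlinks: { "112222": {}, "333333": {}, "545555": {} } };
var pairs = objectToArray(doc1.hotlinks);

// One { k, v } element per inner hotlink object, so the length is the per-document count.
console.log(pairs.length); // 3
```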

All this can be done in a single pipeline by nesting the expressions as follows:

db.collection.aggregate([
    { "$group": {
        "_id": null,
        "count": {
            "$sum": { 
                "$size": { "$objectToArray": "$hotlinks" }
            }
        }
    } }     
])

Example Output

{
    "_id" : null,
    "count" : 6
}
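
One caveat, stated as an assumption about the data rather than something the question mentions: if some documents lack the hotlinks field, $objectToArray yields null for them, and as far as I know $size then raises an error. A hedged variant of the pipeline above wraps the field in $ifNull so those documents count as zero:

```javascript
// Variant of the pipeline above that treats a missing or null hotlinks field as empty.
var pipeline = [
    { "$group": {
        "_id": null,
        "count": {
            "$sum": {
                "$size": {
                    "$objectToArray": { "$ifNull": ["$hotlinks", {}] }
                }
            }
        }
    } }
];

// In the mongo shell (not run here): db.collection.aggregate(pipeline)
```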


This may not be the best approach, but you can define a JavaScript variable and sum up the counts on the client, e.g.:

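var hotlinkTotal = 0;
// hotlinks is an object, not an array, so count its keys instead of reading .length
db.collection.find({}, { hotlinks: 1 }).forEach(function (x) {
    hotlinkTotal += Object.keys(x.hotlinks || {}).length;
});
print(hotlinkTotal);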

3 Comments

Yep - if not this way, a mapreduce operation would be the other alternative.
Hi Lix and Erdimeola, thanks so much for your replies. I have thousands of documents in my collection, so I need a fast and efficient way to get the count. I feel the JavaScript approach would be a heavy operation. Would mapReduce be efficient? Thanks so much, Karan
Thank you everyone
