I have a collection and need an aggregate sum and count of a top-level field in each record, plus an aggregate sum and count of a field inside a nested array. I am using MongoDB 3.0.0 with Jongo.

Please find my record below:

db.events.insert([{
    "eventId": "a21sda2s-711f-12e6-8bcf-p1ff819aer3o",
    "orgName": "ORG1",
    "eventName": "EVA2",
    "eventCost": 5000,
    "bids": [{
        "vendorName": "v1",
        "bidStatus": "ACCEPTED",
        "bidAmount": 4400
    }, {
        "vendorName": "v2",
        "bidStatus": "PROCESSING",
        "bidAmount": 4900
    }, {
        "vendorName": "v3",
        "bidStatus": "REJECTED",
        "bidAmount": "3000"
    }]
}, {
    "eventId": "4427f318-7699-11e5-8bcf-feff819cdc9f",
    "orgName": "ORG1",
    "eventName": "EVA3",
    "eventCost": 1000,
    "bids": [{
        "vendorName": "v1",
        "bidStatus": "REJECTED",
        "bidAmount": 800
    }, {
        "vendorName": "v2",
        "bidStatus": "PROCESSING",
        "bidAmount": 900
    }, {
        "vendorName": "v3",
        "bidStatus": "PROCESSING",
        "bidAmount": 990
    }]
}])

I need eventCount and eventCost, both obtained by aggregating over the eventCost field. I also need acceptedCount and acceptedAmount, obtained by aggregating the bids.bidAmount field for bids whose bids.bidStatus is "ACCEPTED".

The result I need has this form:

[
    {
        "_id": "EVA2",
        "eventCount": 2,
        "eventCost": 10000,
        "acceptedCount": 2,
        "acceptedAmount": 7400
    },
    {
        "_id": "EVA3",
        "eventCount": 1,
        "eventCost": 1000,
        "acceptedCount": 0,
        "acceptedAmount": 0
    }
]

I am not able to get the result in a single query. Right now I run two queries, Query A and Query B (see below), and merge their results in my Java code. Query B uses the $unwind operator.

Is there a way to achieve the same result in a single query? I feel all I need is a way to pass the bids[] array downstream to the next operation in the pipeline.

I tried the $push operator, but I could not figure out a way to push the entire bids[] array downstream.

I don't want to change my record structure, but if there is something intrinsically wrong, I could give it a try. Thanks for all your help.

My Solution

Query A:

db.events.aggregate([
    {$group: {
        _id: "$eventName",
        eventCount: {$sum: 1},          // Count of all events
        eventCost: {$sum: "$eventCost"} // Sum of event costs
    }}
])

Query B:

db.events.aggregate([
    {$unwind: "$bids"},
    {$group: {
        _id: "$eventName",
        // Count of bids that have been accepted
        acceptedCount: {$sum: {$cond: [{$eq: ["$bids.bidStatus", "ACCEPTED"]}, 1, 0]}},
        // Sum of the amounts of bids that have been accepted
        acceptedAmount: {$sum: {$cond: [{$eq: ["$bids.bidStatus", "ACCEPTED"]}, "$bids.bidAmount", 0]}}
    }}
])

Join the results of Query A and Query B in Java code.

What I need:

A single DB operation that accomplishes the same thing.

1 Answer

The problem with unwinding arrays is that it will throw off your counts for the grouped events if you $unwind before doing the initial grouping. Each event document is repeated once per array item, so the number of items in each document's array inflates both the count and the sum computed over the denormalized documents.
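To make the inflation concrete, here is a minimal plain-JavaScript sketch (runnable in Node.js, not MongoDB code). The helpers `unwindBids` and `groupByEvent` are hypothetical stand-ins for `$unwind` and `$group`, and the data is the question's sample trimmed to the relevant fields.

```javascript
// The two sample events from the question, trimmed to the relevant fields.
const events = [
  { eventName: "EVA2", eventCost: 5000, bids: [
    { vendorName: "v1", bidStatus: "ACCEPTED" },
    { vendorName: "v2", bidStatus: "PROCESSING" },
    { vendorName: "v3", bidStatus: "REJECTED" }
  ]},
  { eventName: "EVA3", eventCost: 1000, bids: [
    { vendorName: "v1", bidStatus: "REJECTED" },
    { vendorName: "v2", bidStatus: "PROCESSING" },
    { vendorName: "v3", bidStatus: "PROCESSING" }
  ]}
];

// Stand-in for { $unwind: "$bids" }: one output document per array element.
function unwindBids(docs) {
  return docs.flatMap(doc => doc.bids.map(bid => ({ ...doc, bids: bid })));
}

// Stand-in for a $group on eventName with { $sum: 1 } and { $sum: "$eventCost" }.
function groupByEvent(docs) {
  const totals = {};
  for (const doc of docs) {
    const t = totals[doc.eventName] ||
      (totals[doc.eventName] = { eventCount: 0, eventCost: 0 });
    t.eventCount += 1;
    t.eventCost += doc.eventCost;
  }
  return totals;
}

const grouped = groupByEvent(events);
const unwoundThenGrouped = groupByEvent(unwindBids(events));

console.log(grouped.EVA2);            // { eventCount: 1, eventCost: 5000 }
console.log(unwoundThenGrouped.EVA2); // { eventCount: 3, eventCost: 15000 }
```

With three bids per event, grouping after the unwind counts each event three times and triples its cost.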

Provided it is practical for your data size, there is nothing wrong with using $push to simply create an "array" of "arrays", and then running $unwind twice on each grouped document:

db.events.aggregate([
    { "$group": {
        "_id": "$eventName",
        "eventCount": { "$sum": 1 },
        "eventCost": { "$sum": "$eventCost" },
        "bids": { "$push": "$bids" }
    }},
    { "$unwind": "$bids" },
    { "$unwind": "$bids" },
    { "$group": {
        "_id": "$_id",
        "eventCount": { "$first": "$eventCount" },
        "eventCost": { "$first": "$eventCost" },
        "acceptedCount":{
            "$sum":{
                "$cond": [
                    { "$eq": [ "$bids.bidStatus","ACCEPTED" ] },
                    1,
                    0
                ]
            }
        },
        "acceptedCost":{
            "$sum":{
                "$cond": [
                    { "$eq": [ "$bids.bidStatus","ACCEPTED" ] },
                    "$bids.bidAmount",
                    0
                ]
            }
        }
    }}
])
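As a rough illustration of why the pipeline above needs two $unwind stages, the following plain-JavaScript sketch (Node.js, not MongoDB code) mimics the shape of the data at each step. The two input documents are hypothetical and only carry the fields needed to show the nesting.

```javascript
// Hypothetical: two documents that share one eventName, each with its own bids array.
const docs = [
  { eventName: "EVA2", bids: [{ bidStatus: "ACCEPTED" }, { bidStatus: "REJECTED" }] },
  { eventName: "EVA2", bids: [{ bidStatus: "ACCEPTED" }] }
];

// Stand-in for "bids": { "$push": "$bids" } in the first $group:
// every whole array becomes one element, giving an array of arrays.
const pushed = docs.map(doc => doc.bids);

// First $unwind: one output per element of the outer array.
// Each "bids" value is still an array of bid documents.
const afterFirstUnwind = pushed.map(inner => ({ _id: "EVA2", bids: inner }));

// Second $unwind: one output per individual bid.
const afterSecondUnwind = afterFirstUnwind.flatMap(doc =>
  doc.bids.map(bid => ({ _id: doc._id, bids: bid }))
);

console.log(Array.isArray(afterFirstUnwind[0].bids)); // true
console.log(afterSecondUnwind.length);                // 3
```

Only after the second step does "$bids.bidStatus" refer to a single bid, which is what the $cond expressions expect.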

The likely better alternative is to sum the "accepted" values within each document first, and then sum those per-document values per "event" later:

db.events.aggregate([
    { "$unwind": "$bids" },
    { "$group": {
        "_id": "$_id",
        "eventName": { "$first": "$eventName" },
        "eventCost": { "$first": "$eventCost" },
        "acceptedCount":{
            "$sum":{
                "$cond": [
                    { "$eq": [ "$bids.bidStatus","ACCEPTED" ] },
                    1,
                    0
                ]
            }
        },
        "acceptedCost":{
            "$sum":{
                "$cond": [
                    { "$eq": [ "$bids.bidStatus","ACCEPTED" ] },
                    "$bids.bidAmount",
                    0
                ]
            }
        }
    }},
    { "$group": {
        "_id": "$eventName",
        "eventCount": { "$sum": 1 },
        "eventCost": { "$sum": "$eventCost" },
        "acceptedCount": { "$sum": "$acceptedCount" },
        "acceptedCost": { "$sum": "$acceptedCost" }
    }}
])

In that way each array is reduced to just the values you need to collect, which makes the later $group a lot simpler.
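The same two-step idea can be traced by hand in plain JavaScript (Node.js, not MongoDB code). This is only a sketch of the logic on the question's sample data, using the field names from the pipeline above. The variable names `sampleEvents`, `perDocument`, and `perEvent` are made up for illustration.

```javascript
// The question's sample data, trimmed to the fields the pipeline reads.
const sampleEvents = [
  { eventName: "EVA2", eventCost: 5000, bids: [
    { bidStatus: "ACCEPTED", bidAmount: 4400 },
    { bidStatus: "PROCESSING", bidAmount: 4900 },
    { bidStatus: "REJECTED", bidAmount: "3000" } // a string in the source data
  ]},
  { eventName: "EVA3", eventCost: 1000, bids: [
    { bidStatus: "REJECTED", bidAmount: 800 },
    { bidStatus: "PROCESSING", bidAmount: 900 },
    { bidStatus: "PROCESSING", bidAmount: 990 }
  ]}
];

// Step 1 (stand-in for $unwind plus the first $group on the document _id):
// reduce each document's bids array to two numbers.
const perDocument = sampleEvents.map(doc => {
  const accepted = doc.bids.filter(bid => bid.bidStatus === "ACCEPTED");
  return {
    eventName: doc.eventName,
    eventCost: doc.eventCost,
    acceptedCount: accepted.length,
    acceptedCost: accepted.reduce((sum, bid) => sum + bid.bidAmount, 0)
  };
});

// Step 2 (stand-in for the second $group on eventName):
// every document now contributes exactly once to each total.
const perEvent = {};
for (const doc of perDocument) {
  const t = perEvent[doc.eventName] || (perEvent[doc.eventName] =
    { eventCount: 0, eventCost: 0, acceptedCount: 0, acceptedCost: 0 });
  t.eventCount += 1;
  t.eventCost += doc.eventCost;
  t.acceptedCount += doc.acceptedCount;
  t.acceptedCost += doc.acceptedCost;
}

console.log(perEvent.EVA2); // { eventCount: 1, eventCost: 5000, acceptedCount: 1, acceptedCost: 4400 }
console.log(perEvent.EVA3); // { eventCount: 1, eventCost: 1000, acceptedCount: 0, acceptedCost: 0 }
```

One difference from the real pipeline: on MongoDB 3.0 a $unwind drops documents whose bids array is empty or missing, so such events would not be counted at all, whereas this stand-in keeps them.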

Those are two approaches, with the latter being the better option. However, if you are able to run both queries in parallel and combine the results efficiently, then running two queries as you currently do would be my recommended approach for the best performance.
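For the two-query route, the combining step can be a simple join on _id. The sketch below is plain JavaScript rather than the Java/Jongo code the question uses, and both the `mergeById` helper and the literal result arrays are assumptions for illustration. The same idea carries over to a map keyed by _id in Java.

```javascript
// Hypothetical results of Query A and Query B for the sample data.
const queryA = [
  { _id: "EVA2", eventCount: 1, eventCost: 5000 },
  { _id: "EVA3", eventCount: 1, eventCost: 1000 }
];
const queryB = [
  { _id: "EVA2", acceptedCount: 1, acceptedAmount: 4400 },
  { _id: "EVA3", acceptedCount: 0, acceptedAmount: 0 }
];

// Join the two result sets on _id. Events missing from the right-hand
// side (for example, events without any bids) default to zero.
function mergeById(left, right) {
  const byId = new Map(right.map(doc => [doc._id, doc]));
  return left.map(doc => ({
    acceptedCount: 0,
    acceptedAmount: 0,
    ...doc,
    ...(byId.get(doc._id) || {})
  }));
}

const merged = mergeById(queryA, queryB);
console.log(merged);
```

Indexing one result set in a map keeps the merge linear in the number of groups, so the join itself should add little on top of the two queries.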

1 Comment

Wow, that's a good way to use $first. Thanks for the answer. I am using the second approach from the answer.
