Scenario: I have to aggregate the latest version of the balance for all customers in a particular branch of a bank.

The MongoDB document to be aggregated:

{
    "_id" : {
        "AccountNumber" : "123",
        "branchId" : "AXC",
        "@objectName" : "AccountBalance"
    },
    "Versions" : [ 
        {
            "value" : NumberDecimal("96562.88"),
            "version" : NumberLong(1)
        },
        {
            "value" : NumberDecimal("9612.88"),
            "version" : NumberLong(2)
        }
    ]
}

I tried this, but it returns 0 for the total:

db.getCollection('AccountInfo').aggregate([
  { "$project": { "Versions": { "$slice": [ "$Versions", -1 ] } } },
  { "$match": {    
    "_id.@objectName" : "AccountBalance",
  }},
  { "$group": { "_id": "$_id.branchId", "total": { "$sum": "$Versions.value" } } },
  { "$sort": { "total": -1 } }
])

Any help is appreciated.

2 Answers

You were not far off. The operator you really want is $arrayElemAt instead:

db.getCollection('AccountInfo').aggregate([
  { "$match": { "_id.@objectName" : "AccountBalance" }},
  { "$group": {
    "_id": "$_id.branchId",
    "total": { 
      "$sum": {
        "$arrayElemAt": [ "$Versions.value", -1 ]
      }
    }
  }},
  { "$sort": { "total": -1 } }
])
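
To make the effect of this pipeline concrete, here is a minimal plain-JavaScript sketch of the same logic that runs outside MongoDB. It is only a model under stated assumptions: the helper name totalByBranchLastElement and the second sample document are hypothetical, and plain numbers stand in for NumberDecimal and NumberLong.

```javascript
// Sample documents shaped like the one in the question.
// Plain numbers stand in for NumberDecimal/NumberLong (an assumption for this sketch).
const docs = [
  {
    _id: { AccountNumber: "123", branchId: "AXC", "@objectName": "AccountBalance" },
    Versions: [
      { value: 96562.88, version: 1 },
      { value: 9612.88, version: 2 }
    ]
  },
  {
    // Hypothetical document of another type; the $match step filters it out.
    _id: { AccountNumber: "123", branchId: "AXC", "@objectName": "AccountProfile" },
    Versions: [{ value: 1, version: 1 }]
  }
];

// Hypothetical helper mirroring $match -> $group with $arrayElemAt at index -1.
function totalByBranchLastElement(documents) {
  const totals = new Map();
  for (const doc of documents) {
    if (doc._id["@objectName"] !== "AccountBalance") continue; // $match
    const values = doc.Versions.map(v => v.value);              // "$Versions.value"
    const last = values[values.length - 1];                     // $arrayElemAt: [ ..., -1 ]
    // A missing value (empty Versions array) contributes nothing to the sum.
    const amount = typeof last === "number" ? last : 0;
    totals.set(doc._id.branchId, (totals.get(doc._id.branchId) || 0) + amount);
  }
  // $sort: { total: -1 }
  return [...totals]
    .map(([_id, total]) => ({ _id, total }))
    .sort((a, b) => b.total - a.total);
}

console.log(totalByBranchLastElement(docs));
```

With the sample data, only the AccountBalance document contributes, so branch AXC ends up with the value of version 2.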

$slice returns an "array", so if you use it you still need an inner $sum over its elements:

db.getCollection('AccountInfo').aggregate([
  { "$match": { "_id.@objectName" : "AccountBalance" }},
  { "$group": {
    "_id": "$_id.branchId",
    "total": { 
      "$sum": {
        "$sum": { "$slice": [ "$Versions.value", -1 ] }
      }
    }
  }},
  { "$sort": { "total": -1 } }
])

But it's generally better to get the single element when that is what you really mean. Only use $slice where you actually mean "multiple" array elements.
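
The original attempt returns 0 because, inside $group, $sum treats an operand that resolves to an array as non-numeric and ignores it. The following is a rough plain-JavaScript model of that behaviour, not MongoDB code: groupSum and expressionSum are hypothetical names for this sketch, and plain numbers stand in for NumberDecimal.

```javascript
// Model of the $sum accumulator in $group: non-numeric operands
// (including arrays) are ignored, so they add nothing to the total.
function groupSum(operands) {
  return operands.reduce((acc, x) => (typeof x === "number" ? acc + x : acc), 0);
}

// Model of $sum used as an expression on a single array operand:
// it adds up the numeric elements of that array.
function expressionSum(arr) {
  return arr.reduce((acc, x) => (typeof x === "number" ? acc + x : acc), 0);
}

// One operand per document in the group. After $slice, "$Versions.value"
// is still an array holding a single number.
const sliced = [[9612.88]];

console.log(groupSum(sliced));                                 // the array operand is skipped
console.log(groupSum(sliced.map(arr => expressionSum(arr))));  // inner sum unwraps it first
```

The first call models the failing query, the second models the nested $sum form.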

If you did not know for certain that the latest "version" was the "last" array item, then you can look it up with $indexOfArray and $max:

db.getCollection('AccountInfo').aggregate([
  { "$match": { "_id.@objectName" : "AccountBalance" }},
  { "$group": {
    "_id": "$_id.branchId",
    "total": { 
      "$sum": {
        "$arrayElemAt": [
          "$Versions.value",
          { "$indexOfArray": [
            "$Versions.version",
            { "$max": "$Versions.version" }
          ]}
        ]
      }
    }
  }},
  { "$sort": { "total": -1 } }
])
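
The lookup in this pipeline can be pictured with a small plain-JavaScript sketch. It is an illustration only: latestValue is a hypothetical helper, and the sample values are made up so that the highest version is deliberately not the last array element.

```javascript
// Versions stored out of order: the latest version (3) is not the last element.
const versions = [
  { value: 500, version: 3 },
  { value: 100, version: 1 },
  { value: 250, version: 2 }
];

// Hypothetical helper mirroring $arrayElemAt + $indexOfArray + $max.
function latestValue(entries) {
  const versionNumbers = entries.map(e => e.version);  // "$Versions.version"
  const values = entries.map(e => e.value);            // "$Versions.value"
  const maxVersion = Math.max(...versionNumbers);      // $max
  const index = versionNumbers.indexOf(maxVersion);    // $indexOfArray
  return values[index];                                // $arrayElemAt
}

console.log(latestValue(versions));                // value stored with version 3
console.log(versions[versions.length - 1].value);  // last element, which is version 2
```

Because "$Versions.value" and "$Versions.version" keep the same element order, the index found in one array is valid in the other.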

Also learn to "always" $match first, and don't add a $project stage for something you can do "inline" within the $group. That way fewer documents flow through the pipeline and there is one stage less to run.

All return the same result:

{ "_id" : "AXC", "total" : NumberDecimal("9612.88") }
1 Comment

It works! Thank you for the optimal solution, and +1 for the understanding and detailed breakdown of the problem at hand.

If the Versions array can hold its entries in any order, the last element is not guaranteed to be the latest version, so taking the last element may give the wrong balance. The following aggregation does not rely on the position of the array elements. Instead, it unwinds the array, sorts the entries by Versions.version in descending order, takes the first value for each combination of account number and branch ID, and then sums those values per branch.

    db.bank.aggregate([
      { "$match": { "_id.@objectName": "AccountBalance" } },
      { "$unwind": { "path": "$Versions" } },
      { "$sort": { "Versions.version": -1 } },
      { "$group": {
        "_id": { "accno": "$_id.AccountNumber", "branchid": "$_id.branchId" },
        "value": { "$first": "$Versions.value" }
      }},
      { "$group": { "_id": "$_id.branchid", "total": { "$sum": "$value" } } }
    ])
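
The stages of this pipeline can be traced step by step in plain JavaScript. This is only a sketch of the idea, not MongoDB code: the second account and all values are made up for illustration, and plain numbers stand in for NumberDecimal.

```javascript
// Sample data: two accounts in the same branch, versions in arbitrary order.
const accounts = [
  { _id: { AccountNumber: "123", branchId: "AXC", "@objectName": "AccountBalance" },
    Versions: [{ value: 300, version: 2 }, { value: 100, version: 1 }] },
  { _id: { AccountNumber: "456", branchId: "AXC", "@objectName": "AccountBalance" },
    Versions: [{ value: 50, version: 1 }, { value: 75, version: 2 }] }
];

// $match + $unwind: one row per array entry.
const rows = accounts
  .filter(d => d._id["@objectName"] === "AccountBalance")
  .flatMap(d => d.Versions.map(v => ({ _id: d._id, Versions: v })));

// $sort: { "Versions.version": -1 }
rows.sort((a, b) => b.Versions.version - a.Versions.version);

// First $group: $first value per (account number, branch).
const latest = new Map();
for (const r of rows) {
  const key = JSON.stringify([r._id.AccountNumber, r._id.branchId]);
  if (!latest.has(key)) {
    latest.set(key, { branchid: r._id.branchId, value: r.Versions.value });
  }
}

// Second $group: $sum per branch.
const totals = {};
for (const { branchid, value } of latest.values()) {
  totals[branchid] = (totals[branchid] || 0) + value;
}

console.log(totals);
```

Each account contributes only the value stored with its highest version, regardless of where that entry sits in the array.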
