I have a collection called 'InventoryPerDay' that contains the inventory of each store for each day:

{
  _id: "20131202/store_a",
  _metadata: {
    date: ISODate("2013-12-02T00:00:00Z"),
    store: "store_a"
  },
  inventory: {
    quantity: {
      item_44: 1350,
      item_32: 1,
      item_2: 1,
      item_9: 1
    }
  }
},
{
  _id: "20131201/store_a",
  _metadata: {
    date: ISODate("2013-12-01T00:00:00Z"),
    store: "store_a"
  },
  inventory: {
    quantity: {
      item_44: 1000,
      item_32: 5,
      item_2: 10
    }
  }
}

I need the total quantity of each item in store_a across both days. The keys in the 'quantity' hash are not known in advance: 'item_9' exists for 02/12/2013 but not for 01/12/2013.

How can I sum unknown nested keys across multiple documents using aggregation in MongoDB?

The result for the above example should be:

{
  store: "store_a",
  inventory: {
    quantity: {
      item_44: 2350,
      item_32: 6,
      item_2: 11,
      item_9: 1
    }
  }
}
1 Answer

Unfortunately, what you are trying to do is not possible with the current capabilities of MongoDB, at least not with the aggregation framework. If you keep your current schema, you will have to use map-reduce, which will be much slower.
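For reference, a minimal sketch of what the map-reduce route might look like with the schema from the question. The function names mapInventory and reduceInventory are made up for illustration, and the mapReduce call is shown only as a comment because it needs a running server:

```javascript
// Map function: in the mongo shell, `this` is the current document and
// `emit` is supplied by the server. One pair is emitted per key in the
// quantity hash, so the item names do not need to be known in advance.
function mapInventory() {
  var quantity = this.inventory.quantity;
  for (var item in quantity) {
    emit({ store: this._metadata.store, item: item }, quantity[item]);
  }
}

// Reduce function: sums all quantities emitted for one (store, item) key.
function reduceInventory(key, values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

// In the mongo shell this would be run as (not executed here):
// db.InventoryPerDay.mapReduce(mapInventory, reduceInventory, { out: { inline: 1 } });
```

The reduce function only adds numbers, so it stays correct if the server calls it more than once for the same key.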

You can refer here for an explanation of why your schema is not optimal.

You can refer here to see how you should structure your schema and what your aggregation should look like.

What you are asking for would also be possible with your current schema if you could project field names to values, so you may want to vote for this ticket so that it gets more attention.


Update

You need to change your schema as follows:

{
  _id: "20131202/store_a",
  _metadata: {
    date: ISODate("2013-12-02T00:00:00Z"),
    store: "store_a"
  },
  inventory: {
    quantities: [
      { k : "item_44", v: 1350},
      { k : "item_32", v: 1},
      { k : "item_2", v: 1},
      { k : "item_9", v: 1}
    ]
  }
},
.
.
.

and the query should look like this:

db.InventoryPerDay.aggregate(
 [
    {
        "$unwind" : "$inventory.quantities"
    },
    {
        "$group" : {
            "_id" : { "store": "$_metadata.store", "item" : "$inventory.quantities.k"},
            "total" : {
                "$sum" : "$inventory.quantities.v"
            }
        }
    }
])

This will give you a result like this:

{
  result :
   [
    { "_id" : { store : "store_a", item : "item_44" },
      "total" : 2350
    },
    { "_id" : { store : "store_a", item : "item_32" },
      "total" : 6
    },
    { "_id" : { store : "store_a", item : "item_2" },
      "total" : 11
    },
    { "_id" : { store : "store_a", item : "item_9" },
      "total" : 1
    }
  ]
}

You can format these results with the $project operator at the end of the aggregation pipeline.
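As an illustration, the pipeline above could be extended with a final $project stage that flattens the compound _id into top-level fields. Since $project as used here cannot turn item names back into field names, the sketch also includes a hypothetical client-side helper, nestTotals, that folds the rows into the nested shape from the question. Both are untested against a real server:

```javascript
// The pipeline from above, plus a $project stage that flattens the
// compound _id into top-level store and item fields.
var pipeline = [
  { "$unwind": "$inventory.quantities" },
  { "$group": {
      "_id": { "store": "$_metadata.store", "item": "$inventory.quantities.k" },
      "total": { "$sum": "$inventory.quantities.v" }
  } },
  { "$project": { "_id": 0, "store": "$_id.store", "item": "$_id.item", "total": 1 } }
];

// In the mongo shell (not executed here): db.InventoryPerDay.aggregate(pipeline)

// Hypothetical client-side helper: fold rows that carry store, item and
// total fields into one nested object per store.
function nestTotals(rows) {
  var byStore = {};
  rows.forEach(function (row) {
    if (!byStore[row.store]) {
      byStore[row.store] = { store: row.store, inventory: { quantity: {} } };
    }
    byStore[row.store].inventory.quantity[row.item] = row.total;
  });
  return Object.keys(byStore).map(function (store) {
    return byStore[store];
  });
}
```

Doing the final nesting on the client keeps the pipeline independent of the item names.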

Regarding map-reduce vs. aggregation: map-reduce is much slower than aggregation, mainly because it is executed on a single thread. In most cases, it takes orders of magnitude more time to compute a result with map-reduce than with aggregation. You can vote for this ticket so that it gets implemented for multiple cores.

4 Comments

That's what I was afraid of... Can you add an example of the new structure based on my example?
Also, if I change the structure, will the aggregation framework still be faster than map-reduce?
My answer had some mistakes, which I corrected, so if you tested it and it didn't work, check the edits to see what I changed.
I think this question is similar to this one stackoverflow.com/questions/56640054/…
