Okay, so I've been searching for a while but couldn't find an answer to this, and I am desperate :P

I have some documents with this structure:

{
    "period": ISODate("2018-05-29T22:00:00.000+0000"),
    "totalHits": 13982,
    "hits": [
        {
            // some fields...
            users: [
                { 
                    // some fields...
                    userId: 1,
                    products: [
                        { productId: 1, price: 30 },
                        { productId: 2, price: 30 },
                        { productId: 3, price: 30 },
                        { productId: 4, price: 30 },
                    ]
                },
            ]
        }
    ]
}

And I want to retrieve a count of how many products (independently of which user has them) we have in each period. An example output would look like this:

[
    {
        "period": ISODate("2018-05-27T22:00:00.000+0000"),
        "count": 432
    },
    {
        "period": ISODate("2018-05-28T22:00:00.000+0000"),
        "count": 442
    },
    {
        "period": ISODate("2018-05-29T22:00:00.000+0000"),
        "count": 519
    }
]

What is driving me crazy is the "object inside an array inside an array" part. I've done many aggregations, but I think they were simpler than this one, so I am a bit lost.

I am thinking about changing our document structure to a better one, but we have ~6M documents that we would need to transform to the new one, and that's just a mess... but maybe it's the only solution.

We are using MongoDB 3.2, we can't update our systems atm (I wish, but not possible).

  • If you want that "per user id" then you have no other option than to $unwind each array whatever the version. Newer versions will not do anything for you here and the issue seems more with nesting arrays where you should really have a much flatter structure. Commented May 31, 2018 at 11:36

1 Answer

You can use $unwind to expand your array, then use $group to sum:

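db.test.aggregate([
    {$match: {}},                 // place any document-level filter here
    {$unwind: "$hits"},           // one document per element of hits
    {$project: {_id: "$_id", period: "$period", users: "$hits.users"}}, 
    {$unwind: "$users"},          // one document per user
    // count holds the number of products for this user
    {$project: {_id: "$_id", period: "$period", count: {$size: "$users.products"}}}, 
    // sum the per-user counts for each period
    {$group: {"_id": "$period", "count": {$sum: "$count"}}}
])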
4 Comments

I've read that $unwind is not optimal for huge collections, but it takes very little time to get the count. I still need to check the results, but it gives me a better result than the aggregations I've tried, so I'll give you my feedback as soon as we test stuff. Looks promising :D (I will carve this aggregation into my brain)
It looks like it is doing a SUM of all the products and not only the ones matching that ID. For example, in our old data we have a count of 180 for a product, but with this query I am getting a count of 8641, which is really huge and looks like the sum of every product (because of our stock etc.). I tried with $filter, but I am not really good with it, so it's doing nothing :P (the sum is always 0)
You have to place your filter in the {$match: {}} stage and tailor it to your needs. If you want a specific product, place a $match after each projection.
Yeah, I placed my filter inside that match: {"hits.users.products.productId": 1}. I think the $sum of $size is the problem, because the products array doesn't contain only one product but a bunch of them. I think it counts the size of the whole array if the document contains that product, instead of adding 1 to the count every time that exact product appears inside hits (I think. I am going off now, I'll check again tomorrow morning) ^^
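
For the per-product count discussed in the comments above, one option is to unwind the products array as well and match on the unwound entries, so each matching product adds 1 instead of the whole array size. This is only a minimal sketch: the collection name test comes from the answer, productId 1 comes from the last comment, and the variable names productId and pipeline are made up for illustration. All stages used here should be available in MongoDB 3.2.

```javascript
// Hypothetical sketch: count occurrences of one product per period.
// Assumes productId 1 and the collection name "test" from the answer above.
var productId = 1;

var pipeline = [
    // Pre-filter: skip documents that never contain the product.
    {$match: {"hits.users.products.productId": productId}},
    // Flatten all three array levels so each document holds one product.
    {$unwind: "$hits"},
    {$unwind: "$hits.users"},
    {$unwind: "$hits.users.products"},
    // Keep only the unwound entries for the wanted product.
    {$match: {"hits.users.products.productId": productId}},
    // Add 1 per matching product entry, grouped by period.
    {$group: {_id: "$period", count: {$sum: 1}}},
    {$sort: {_id: 1}},
    // Reshape to the {period, count} form shown in the question.
    {$project: {_id: 0, period: "$_id", count: 1}}
];

// Only runs inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
    db.test.aggregate(pipeline).forEach(printjson);
}
```

The first $match only discards whole documents; the second one is what restricts the count to the product itself, since after the three $unwind stages each document carries a single product entry.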
