Suppose I have the following dataset:

// listings collection

{
  "city": "New York",
  "beds": 2,
  "build": "concrete"
}

{
  "city": "New York",
  "beds": 4,
  "build": "wood"
}

{
  "city": "New York",
  "beds": 3,
  "build": "asphalt"
}

{
  "city": "New York",
  "beds": 1,
  "build": "concrete"
}

I can get the average number of beds with the following query:

        db.listings.aggregate(
            [  
                {  
                    $match: {  
                        "city": "New York"
                    }
                },
                {  
                    $group: {  
                        "_id": null,
                        "Avg-Beds": {  
                            $avg: "$beds"
                        }
                    }
                }
            ])

Which is cool, but what I'm really looking for is something like

{
    "Avg-Beds": 2,
    "Build": {
        "Asphalt": 1,
        "Wood": 1,
        "Concrete": 2
    }
}

In summary, I want to average the beds and, at the same time, count how often each value of the "build" field occurs. How is this achievable with MongoDB?

Even better would be something like an output of

"Build": {
           "Asphalt": "25%"
 }

That would give a percentage-based value. Note that I do not have a predefined set of values for the "build" field.

1 Answer

You can try the aggregation below:

db.listings.aggregate([
    {
        $match: { "city": "New York" }
    },
    {
        $group: {
            _id: null,
            avg: { $avg: "$beds" },
            docs: { $push: "$$ROOT" }
        }
    },
    {
        $unwind: "$docs"
    },
    {
        $group: {
            _id: "$docs.build",
            avg: { $first: "$avg" },
            beds: { $sum: "$docs.beds" }
        }
    },
    {
        $group: {
            _id: null,
            avg: { $first: "$avg" },
            total: { $sum: "$beds" },
            Build: { $push: { k: "$_id", v: "$beds" } }
        }
    },
    {
        $addFields: {
            Build: {
                $map: {
                    input: "$Build",
                    as: "b",
                    in: {
                        k: "$$b.k",
                        v: { $divide: [ "$$b.v", "$total" ] }
                    }
                }
            }
        }
    },
    {
        $project: {
            _id: 0,
            avg: 1,
            Build: { $arrayToObject: "$Build" }
        }
    }
])

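The point is that you need two independent aggregations. The pipeline runs the first one ($avg) and then embeds its result into every document by pushing all matched documents into a docs array and unwinding that array. It then groups by build, collects an array of k/v pairs, divides each value by the overall total, and applies $arrayToObject to turn the array into a single object. Note that the second $group sums beds per build, so each value is that build's share of all beds (as a fraction, not a percent string); to work with the number of listings per build instead, use { $sum: 1 } in that stage.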

As a result you'll get:

{ "avg" : 2.5, "Build" : { "asphalt" : 0.3, "wood" : 0.4, "concrete" : 0.3 } }
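To cross-check those numbers without a MongoDB server, here is a minimal plain-JavaScript sketch that redoes the arithmetic on the four sample documents. It is not part of the answer: the `summarize` helper and its field names are made up for illustration, and it assumes at least one document matches the city. It reports both the share of beds per build and the share of listings per build, so the two can be compared side by side.

```javascript
// Sample documents from the question (string values quoted).
const listings = [
  { city: "New York", beds: 2, build: "concrete" },
  { city: "New York", beds: 4, build: "wood" },
  { city: "New York", beds: 3, build: "asphalt" },
  { city: "New York", beds: 1, build: "concrete" },
];

// Hypothetical helper: mirrors the pipeline's arithmetic in plain JavaScript.
function summarize(docs, city) {
  const matched = docs.filter((d) => d.city === city); // like $match
  const totalBeds = matched.reduce((sum, d) => sum + d.beds, 0);
  const avg = totalBeds / matched.length; // like $avg (assumes matched is non-empty)

  const bedsByBuild = {}; // sum of beds per build value
  const countByBuild = {}; // number of listings per build value
  for (const d of matched) {
    bedsByBuild[d.build] = (bedsByBuild[d.build] || 0) + d.beds;
    countByBuild[d.build] = (countByBuild[d.build] || 0) + 1;
  }

  const bedShare = {}; // beds per build / total beds
  const listingShare = {}; // listings per build / total listings
  for (const build of Object.keys(bedsByBuild)) {
    bedShare[build] = bedsByBuild[build] / totalBeds;
    listingShare[build] = countByBuild[build] / matched.length;
  }
  return { avg, bedShare, countByBuild, listingShare };
}

const summary = summarize(listings, "New York");
console.log(summary);
```

For the sample data, `bedShare` comes out as 0.3 / 0.4 / 0.3 for concrete / wood / asphalt, while `listingShare` comes out as 0.5 / 0.25 / 0.25, which corresponds to the 25% for asphalt that the question mentions.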

3 Comments

Would it be simpler to skip the percentage and return a simple count instead? I can live with calculating percentages outside of the query.
All you have to do is replace v: { $divide: [ "$$b.v", "$total" ] } with v: "$$b.v" to get the raw per-build totals instead of percentages.
OK. This looks rather complex for a simple concept. Not sure if this will be friendly to my developers. Thanks.
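Following up on the count question, here is a minimal sketch of a counts-only variant, assuming the same listings collection as in the question. It keeps the stages from the answer but counts documents per build with { $sum: 1 } and drops the $map/$divide step, since nothing needs to be divided. The name `countPipeline` is only for illustration.

```javascript
// Counts-only variant of the answer's pipeline (a sketch, not the answer's own code).
const countPipeline = [
  // Keep only the listings for one city.
  { $match: { city: "New York" } },
  // Compute the overall average and keep every matched document alongside it.
  { $group: { _id: null, avg: { $avg: "$beds" }, docs: { $push: "$$ROOT" } } },
  // Back to one document per listing, each now carrying the average.
  { $unwind: "$docs" },
  // Count listings per build value.
  { $group: { _id: "$docs.build", avg: { $first: "$avg" }, count: { $sum: 1 } } },
  // Collapse into one document holding an array of { k, v } pairs.
  { $group: { _id: null, avg: { $first: "$avg" }, Build: { $push: { k: "$_id", v: "$count" } } } },
  // Turn the { k, v } array into a single object keyed by build value.
  { $project: { _id: 0, avg: 1, Build: { $arrayToObject: "$Build" } } },
];

// In the mongo shell: db.listings.aggregate(countPipeline)
```

With the four sample documents this should return something like { "avg" : 2.5, "Build" : { "concrete" : 2, "wood" : 1, "asphalt" : 1 } }, though the key order inside Build is not guaranteed. Because $push: "$$ROOT" collects every matched document into a single group document, treat this as a sketch for small result sets.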
