sort array in query and project all fields

Question

I would like to sort a nested array at query time while also projecting all fields in the document.

Example document:

{ "_id" : 0, "unknown_field" : "foo", "array_to_sort" : [ { "a" : 3, "b" : 4 }, { "a" : 3, "b" : 3 }, { "a" : 1, "b" : 0 } ] }

I can perform the sorting with an aggregation but I cannot preserve all the fields I need. The application does not know at query time what other fields may appear in each document, so I am not able to explicitly project them. If I had a wildcard to project all fields then this would work:

db.c.aggregate([
    {$unwind: "$array_to_sort"},
    {$sort: {"array_to_sort.b":1, "array_to_sort.a": 1}},
    {$group: {_id:"$_id", array_to_sort: {$push:"$array_to_sort"}}}
]);

...but unfortunately, it produces a result that does not contain the "unknown_field":

    {
        "_id" : 0,
        "array_to_sort" : [
            {
                "a" : 1,
                "b" : 0
            },
            {
                "a" : 3,
                "b" : 3
            },
            {
                "a" : 3,
                "b" : 4
            }
        ]
    }

Here is the insert command in case you would like to experiment:

db.c.insert({"_id": 0, "unknown_field": "foo", "array_to_sort": [{"a": 3, "b": 4}, {"a": 3, "b": 3}, {"a": 1, "b": 0}]})

I cannot pre-sort the array because the sort criteria are dynamic. I may be sorting by any combination of a and/or b, ascending or descending, at query time. I realize I may need to do this in my client application, but it would be sweet if I could do it in mongo, because then I could also $slice/skip/limit the results for paging instead of retrieving the entire array every time.

Neil Lunn · Accepted Answer · 2014-04-08 03:48:48Z

2

Since you are grouping on the document _id, you can simply place the fields you wish to keep within the grouping _id. Then you can re-form the document using $project:

db.c.aggregate([
    { "$unwind": "$array_to_sort"},
    { "$sort": {"array_to_sort.b":1, "array_to_sort.a": 1}},
    { "$group": { 
        "_id": {
            "_id": "$_id",
            "unknown_field": "$unknown_field"
        },
        "Oarray_to_sort": { "$push":"$array_to_sort"}
    }},
    { "$project": {
        "_id": "$_id._id",
        "unknown_field": "$_id.unknown_field",
        "array_to_sort": "$Oarray_to_sort"
    }}
]);

The other "trick" in there is using a temporary name for the array in the grouping stage. This way, when you $project and change the name, you get the fields in the order specified in the projection statement. If you kept the original name, "array_to_sort" would not be the last field in the output, because it would be copied as-is from the prior stage rather than added in projection order.

That is an intended optimization in $project, but if you want the order then you can do it as above.
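Because the sort order in the question is dynamic, the pipeline above could also be assembled at query time. The following is only a sketch: `buildSortPipeline`, `arrayField`, `sortSpec` and `keepFields` are hypothetical names for illustration, and it still assumes the fields to keep are known in advance.

```javascript
// Hypothetical helper (not part of MongoDB): builds the
// $unwind / $sort / $group / $project pipeline shown above
// for a caller-supplied sort order.
function buildSortPipeline(arrayField, sortSpec, keepFields) {
    // Prefix each sort key with the array field, e.g. "b" -> "array_to_sort.b".
    var sort = {};
    Object.keys(sortSpec).forEach(function (key) {
        sort[arrayField + "." + key] = sortSpec[key];
    });

    // Carry the known fields through $group inside a compound _id.
    var groupId = { "_id": "$_id" };
    var project = { "_id": "$_id._id" };
    keepFields.forEach(function (field) {
        groupId[field] = "$" + field;
        project[field] = "$_id." + field;
    });

    // Temporary array name, so the array is projected last.
    var tmp = "O" + arrayField;
    var group = { "_id": groupId };
    group[tmp] = { "$push": "$" + arrayField };
    project[arrayField] = "$" + tmp;

    return [
        { "$unwind": "$" + arrayField },
        { "$sort": sort },
        { "$group": group },
        { "$project": project }
    ];
}

// Sort by b descending, then a ascending.
var pipeline = buildSortPipeline("array_to_sort", { b: -1, a: 1 }, ["unknown_field"]);
```

In the shell, the resulting array would then be passed to db.c.aggregate(pipeline).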

For completely unknown structures there is the mapReduce way of doing things:

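db.c.mapReduce(
    function () {
        // Sort the array in place by b, then a (both ascending),
        // matching the $sort order used in the aggregation above.
        this["array_to_sort"].sort(function (x, y) {
            return x.b - y.b || x.a - y.a;
        });

        emit( this._id, this );
    },
    // Each _id is emitted exactly once, so reduce is never called.
    function () {},
    { "out": { "inline": 1 } }
)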

Of course, that has an output format specific to mapReduce and is therefore not exactly the document you had, but all the fields are contained under "value":

{
    "results" : [
            {
                    "_id" : 0,
                    "value" : {
                            "_id" : 0,
                            "some_field" : "a",
                            "array_to_sort" : [
                                    {
                                            "a" : 1,
                                            "b" : 0
                                    },
                                    {
                                            "a" : 3,
                                            "b" : 3
                                    },
                                    {
                                            "a" : 3,
                                            "b" : 4
                                    }
                            ]
                    }
            }
    ],
}
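The comparator in the map function is fixed, whereas the question needs a dynamic sort order. A minimal sketch of building the comparator from a sort specification follows; `makeComparator` and `sortAndPage` are hypothetical helper names, and the `[field, direction]` pair format is an assumption made for illustration. The same functions would also work in the client application, where the paging can be done with a plain slice.

```javascript
// Hypothetical helper: turn a spec such as [["b", -1], ["a", 1]]
// into a comparator usable with Array.prototype.sort.
// A direction of 1 means ascending, -1 means descending.
function makeComparator(spec) {
    return function (x, y) {
        for (var i = 0; i < spec.length; i++) {
            var field = spec[i][0];
            var direction = spec[i][1];
            if (x[field] < y[field]) return -direction;
            if (x[field] > y[field]) return direction;
        }
        return 0; // equal on every field in the spec
    };
}

// Hypothetical helper: sort a copy of the array, then take one page of it.
function sortAndPage(items, spec, skip, limit) {
    // Copy first so the caller's array is left untouched.
    return items.slice().sort(makeComparator(spec)).slice(skip, skip + limit);
}

var items = [ { a: 3, b: 4 }, { a: 3, b: 3 }, { a: 1, b: 0 } ];

// First page of two elements, sorted by b descending, then a ascending.
var firstPage = sortAndPage(items, [["b", -1], ["a", 1]], 0, 2);
```

To use this inside mapReduce, one option might be to pass the spec through the "scope" option and build the comparator inside the map function.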

MongoDB 2.6 (a release candidate at the time of writing) adds a $$ROOT variable that you can use in aggregate to represent the whole document:

db.c.aggregate([
    { "$project": {
        "_id": "$$ROOT",
        "array_to_sort": "$array_to_sort"
    }},
    { "$unwind": "$array_to_sort"},
    { "$sort": {"array_to_sort.b":1, "array_to_sort.a": 1}},
    { "$group": { 
        "_id": "$_id",
        "array_to_sort": { "$push":"$array_to_sort"}
    }}
]);

There is no point in adding a final $project stage here, as you do not actually know the other fields in the document. But they will all be contained (including the original array in its original order) within the _id field of the result document.
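If the original document shape is needed, it could be rebuilt in the client from each result document. This is a sketch only: `restoreDocument` is a hypothetical helper, and the sample result below is written by hand to match the shape described above rather than copied from real output.

```javascript
// Hypothetical helper: rebuild the original document shape from a
// { _id: <original document>, array_to_sort: <sorted array> } result.
function restoreDocument(result, arrayField) {
    var restored = {};
    // Copy every field of the original document, known or not.
    Object.keys(result._id).forEach(function (key) {
        restored[key] = result._id[key];
    });
    // Replace the unsorted array with the sorted one from the $group stage.
    restored[arrayField] = result[arrayField];
    return restored;
}

// Hand-written sample in the shape the $$ROOT pipeline is described as returning.
var sampleResult = {
    "_id": {
        "_id": 0,
        "unknown_field": "foo",
        "array_to_sort": [ { "a": 3, "b": 4 }, { "a": 3, "b": 3 }, { "a": 1, "b": 0 } ]
    },
    "array_to_sort": [ { "a": 1, "b": 0 }, { "a": 3, "b": 3 }, { "a": 3, "b": 4 } ]
};

var restoredDoc = restoreDocument(sampleResult, "array_to_sort");
```

The helper would be applied to each document the aggregation returns.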

edited Apr 8, 2014 at 3:48

answered Apr 6, 2014 at 2:01


6 Comments

Bob B Over a year ago

@Neil_Lunn Thanks. The application does not know the name of the unknown_field at query time. I'm looking for a way to catch all of them - like a wildcard or include all type thing.

Neil Lunn Over a year ago

@BobB From MongoDB version 2.6 (which is currently a release candidate) there is a "$$ROOT" variable that can be used to copy the whole document in its form at a given pipeline stage. But you will not get the document back in the same form if you do not know the fields. The other option is to do this with "mapReduce" and arbitrary JavaScript. You would actually be better off reconsidering your schema design to have at least some level of "uniform" behavior.

Neil Lunn Over a year ago

@BobB Added the mapReduce and future aggregate methods for reference.

Neil Lunn Over a year ago

@BobB perhaps you missed the comment that indicated the answer had been edited.

Bob B Over a year ago

@Neil_Lunn I am unable to test the $$ROOT feature but this sounds like it would do what I need.
