
I am having trouble creating an index for my MongoDB query that would avoid the SORT stage. I am not even sure that is possible in my case. Here is my query with execution stats:

db.getCollection('test').find(
{
    "$or" : [
    {
        "a" : { "$elemMatch" : { "_id" : { "$in" : [4577] } } },
        "b" : { "$in" : [290] },
        "c" : { "$in" : [35, 49, 57, 101, 161, 440] },
        "d" : { "$lte" : 399 }
    },
    {
        "e" : { "$elemMatch" : { "numbers" : { "$in" : ["1K0407151AC", "0K20N51150A"] } } },
        "d" : { "$lte" : 399 }
    }]
})
.sort({ "X" : 1, "d" : 1, "Y" : 1, "Z" : 1 }).explain("executionStats")

The fields 'm', 'a' and 'e' are arrays, that is why 'm' is not included in any index.

If you check the execution stats screenshot, you will see that memory usage is pretty close to the maximum, and unfortunately I have had cases where the query failed to execute because of the 32 MB in-memory sort limit.

Index for the first part of the $or query: { "a._id" : 1, "X" : 1, "d" : 1, "Y" : 1, "Z" : 1, "b" : 1, "c" : 1 }

Index for the second part of the $or query: { "e.numbers" : 1, "X" : 1, "d" : 1, "Y" : 1, "Z" : 1 }
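One condition that comes up for SORT_MERGE is that the index used by each $or branch ends with the sort key pattern rather than merely containing it. The sketch below is plain JavaScript, not a MongoDB API: endsWithSort is a hypothetical helper that checks this for the two key patterns above.

```javascript
// Hypothetical helper: true when the last fields of an index key pattern
// match the sort specification, in the same order and direction.
function endsWithSort(indexKeys, sortSpec) {
  const indexEntries = Object.entries(indexKeys);
  const sortEntries = Object.entries(sortSpec);
  if (sortEntries.length > indexEntries.length) return false;
  const tail = indexEntries.slice(indexEntries.length - sortEntries.length);
  return tail.every(([field, dir], i) =>
    field === sortEntries[i][0] && dir === sortEntries[i][1]);
}

const sortSpec = { X: 1, d: 1, Y: 1, Z: 1 };
const firstIndex = { "a._id": 1, X: 1, d: 1, Y: 1, Z: 1, b: 1, c: 1 };
const secondIndex = { "e.numbers": 1, X: 1, d: 1, Y: 1, Z: 1 };

console.log(endsWithSort(firstIndex, sortSpec));  // false: "b" and "c" trail the sort fields
console.log(endsWithSort(secondIndex, sortSpec)); // true
```

Passing this check is at most a necessary condition under that assumption; it does not guarantee that the planner picks a SORT_MERGE plan.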

The indexes are used by the query, but not for sorting. Instead of the SORT stage I would like to see a SORT_MERGE stage, but I have had no success so far. If I run the two parts of the $or as separate queries, each is able to use its index and avoid sorting in memory. That is acceptable as a workaround, but I would then need to merge and re-sort the results in the application.
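The application-side workaround could look roughly like the sketch below: a two-way merge of result arrays that are each already sorted by { X, d, Y, Z }. The names compareDocs and mergeSortedBranches are made up for illustration, and the plain < and > comparisons are an assumption that only approximates MongoDB's ordering for values of the same type.

```javascript
// Fields of the sort key { X: 1, d: 1, Y: 1, Z: 1 }, in order.
const SORT_FIELDS = ["X", "d", "Y", "Z"];

// Compare two documents field by field (assumes same-typed values per field).
function compareDocs(left, right) {
  for (const field of SORT_FIELDS) {
    if (left[field] < right[field]) return -1;
    if (left[field] > right[field]) return 1;
  }
  return 0;
}

// Merge two arrays that are each already sorted by compareDocs. A document
// can match both $or branches, so repeated _id values are emitted only once.
function mergeSortedBranches(first, second) {
  const merged = [];
  const seen = new Set();
  let i = 0;
  let j = 0;
  while (i < first.length || j < second.length) {
    const takeFirst =
      j >= second.length ||
      (i < first.length && compareDocs(first[i], second[j]) <= 0);
    const doc = takeFirst ? first[i++] : second[j++];
    if (!seen.has(doc._id)) {
      seen.add(doc._id);
      merged.push(doc);
    }
  }
  return merged;
}

// Tiny made-up result sets standing in for the two branch queries.
const branchA = [
  { _id: "1", X: "A", d: 1, Y: "N", Z: "1" },
  { _id: "3", X: "C", d: 1, Y: "N", Z: "3" },
];
const branchB = [
  { _id: "2", X: "B", d: 1, Y: "N", Z: "2" },
  { _id: "3", X: "C", d: 1, Y: "N", Z: "3" },
];
console.log(mergeSortedBranches(branchA, branchB).map((doc) => doc._id));
```

The merge is linear in the number of documents, but the set of seen _id values still grows with the result size, so this moves the memory cost to the application rather than removing it.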

The MongoDB version is 3.4.2. I have checked two related questions, and my query is the result of what I found there. Did I miss something?

Edit: mongo documents look like that:

{
    "_id" : "290_440_K760A03",
    "Z" : "K760A03",
    "c" : 440,
    "Y" : "NPS",
    "b" : 290,
    "X" : "Schlussleuchte",
    "e" : [ 
        {
            "..." : 184,
            "numbers" : [ 
                "0K20N51150A"
            ]
        }
    ],
    "a" : [ 
        {
            "_id" : 4577,
            "..." : [ 
                {
                    "..." : [ 
                        {
                            "..." : "R",
                        }
                    ]
                }
            ]
        }, 
        {
            "_id" : 4578            
        }
    ],
    "d" : 101,
    "m" : [ 
        "AT", 
        "BR", 
        "CH"
    ],
    "moreFields":"..."
}

Edit 2: removed the field "m" from the query to reduce complexity and attached a test collection dump for anyone who wants to help :)

  • According to the answer in the first question you link to, both indexes would need to end in "X" : 1, "d" : 1, "Y" : 1, "Z" : 1 for a SORT_MERGE, not just contain those fields. Commented Apr 10, 2017 at 13:51
  • @JohnnyHK, I changed the first index to { "a._id" : 1, "X" : 1, "d" : 1, "Y" : 1, "Z" : 1} to see if that works, but without success. Sorting is still done in memory; the query just got slower. Commented Apr 10, 2017 at 14:34
  • Can you please add one or two sample documents? Commented Apr 10, 2017 at 17:11
  • @lovegupta, if you mean documents, then see the updated question. The comment length is not enough. Commented Apr 11, 2017 at 10:24
  • Can you try creating the four indexes below and see if it works for you? It worked for me and I got SORT_MERGE. 1. {"m":1,"b":1,"c":1,"X":1,"d":1,"Y":1,"Z":1} 2. {"a._id":1,"b":1,"c":1,"X":1,"d":1,"Y":1,"Z":1} 3. {"m":1,"X":1,"d":1,"Y":1,"Z":1} 4. {"e.numbers":1,"X":1,"d":1,"Y":1,"Z":1} Commented Apr 11, 2017 at 12:52

1 Answer


Here is the solution: I added one document to my test collection, as shown in the edit to your question. Then I created the four indexes below:

 1. {"m":1,"b":1,"c":1,"X":1,"d":1,"Y":1,"Z":1}
 2. {"a._id":1,"b":1,"c":1,"X":1,"d":1,"Y":1,"Z":1}
 3. {"m":1,"X":1,"d":1,"Y":1,"Z":1}
 4. {"e.numbers":1,"X":1,"d":1,"Y":1,"Z":1}

When I then ran the given query with execution stats, it showed the SORT_MERGE stage as expected.

Here is the explanation: MongoDB has a guideline called equality-sort-range, which largely determines the order of fields in an index. I followed this rule and kept the index in that order, so the index should be {Equality fields, "X":1,"d":1,"Y":1,"Z":1, Range fields}. The query has a range condition on field "d" only ("d" : { "$lte" : 399 }), but "d" is already covered by the sort fields of the index ("X":1,"d":1,"Y":1,"Z":1), so we can omit the range part (i.e. field "d") from the end of the index.

If "d" had NOT been part of the sort or an equality predicate, I would have added it to the index as a range field, and my index would have looked like {Equality fields, "X":1,"Y":1,"Z":1,"d":1}.
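The ordering rule as this answer applies it can be sketched in plain JavaScript. buildEsrIndex is a hypothetical helper, not part of MongoDB, and it assumes the answer's reading that the $in conditions count as equality fields.

```javascript
// Hypothetical helper: build an index key pattern in equality, sort, range order.
// A range field that already appears among the equality or sort fields is skipped.
function buildEsrIndex(equalityFields, sortSpec, rangeFields) {
  const keys = {};
  for (const field of equalityFields) keys[field] = 1;
  for (const [field, dir] of Object.entries(sortSpec)) {
    if (!(field in keys)) keys[field] = dir;
  }
  for (const field of rangeFields) {
    if (!(field in keys)) keys[field] = 1;
  }
  return keys;
}

// "d" is both a sort field and the range field, so it is not repeated at the end.
const withDInSort = buildEsrIndex(["a._id", "b", "c"], { X: 1, d: 1, Y: 1, Z: 1 }, ["d"]);
console.log(JSON.stringify(withDInSort));

// If "d" were not part of the sort, it would be appended last as the range field.
const withDAsRange = buildEsrIndex(["a._id", "b", "c"], { X: 1, Y: 1, Z: 1 }, ["d"]);
console.log(JSON.stringify(withDAsRange));
```

The first call reproduces index 2 from the list above; the second shows the alternative layout described in the previous paragraph.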

So my index is {Equality fields, "X":1,"d":1,"Y":1,"Z":1}, and the only open question is which equality fields to use. To find them I looked at the predicates of the find query, which consists of two conditions combined with the OR operator.

  • The first condition has equality on "a._id", "b", "c", "m" ("d" has a range, not equality). So I would need an index like "a._id":1,"m":1,"b":1,"c":1,"X":1,"d":1,"Y":1,"Z":1, but creating it fails because it contains two array fields, "a._id" and "m", and MongoDB does not allow a compound index on parallel arrays. So I created two separate indexes and let the query planner choose between them. That is how I got the first and second index.
  • The second condition of the OR operator has "e.numbers" and "m". Both are array fields, so I had to create two indexes, as for the first condition. That is how I got the third and fourth index.
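The parallel-array restriction can be approximated with a small plain JavaScript check against a sample document. firstArrayPrefix and parallelArrayPrefixes are hypothetical helpers, and the check is a simplification: it looks only at the outermost array on each indexed path and ignores arrays nested inside it.

```javascript
// Return the outermost array-valued prefix of a dotted path in doc, or null.
function firstArrayPrefix(doc, path) {
  const parts = path.split(".");
  let value = doc;
  for (let i = 0; i < parts.length; i++) {
    if (value === null || typeof value !== "object") return null;
    value = value[parts[i]];
    if (Array.isArray(value)) return parts.slice(0, i + 1).join(".");
  }
  return null;
}

// Collect the distinct array prefixes touched by an index key pattern.
// More than one entry suggests the index would span parallel arrays.
function parallelArrayPrefixes(indexKeys, sampleDoc) {
  const prefixes = new Set();
  for (const path of Object.keys(indexKeys)) {
    const prefix = firstArrayPrefix(sampleDoc, path);
    if (prefix !== null) prefixes.add(prefix);
  }
  return [...prefixes];
}

// Trimmed version of the document shown in the question.
const sampleDoc = {
  a: [{ _id: 4577 }, { _id: 4578 }],
  e: [{ numbers: ["0K20N51150A"] }],
  m: ["AT", "BR", "CH"],
  b: 290, c: 440, d: 101,
  X: "Schlussleuchte", Y: "NPS", Z: "K760A03",
};

const combined = { "a._id": 1, m: 1, b: 1, c: 1, X: 1, d: 1, Y: 1, Z: 1 };
const indexTwo = { "a._id": 1, b: 1, c: 1, X: 1, d: 1, Y: 1, Z: 1 };
console.log(parallelArrayPrefixes(combined, sampleDoc)); // two array fields: "a" and "m"
console.log(parallelArrayPrefixes(indexTwo, sampleDoc)); // one array field: "a"
```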

Each branch of an $or can use its own index, but a branch uses only one index at a time, so I created these indexes because I don't know in advance which one the query planner will choose for each branch.

Note: If you are concerned about index size, you can keep only one index from the first two and one from the last two. Alternatively, you can keep all four and hint MongoDB to use a specific index if you know in advance which one is best.


4 Comments

I edited my question and added a test collection. The index 1 and 3 can be removed now. Unfortunately, I still have SORT stage even if I change the index 2 as in your answer. With this index 2 you will have SORT_MERGE stage for the first part of the $or query ("a" : { "$elemMatch" : { "_id" : { "$in" : [4577] } } }, "b" : { "$in" : [290] }, "c" : { "$in" : [35, 49, 57, 101, 161, 440] }, "d" : { "$lte" : 399 }) which you can totally avoid if you use this index: { "a._id" : 1, "X" : 1, "d" : 1, "Y" : 1, "Z" : 1, "b" : 1, "c" : 1 }.
The comment size is not enough, so I will post a second one :) You are right about equality-sort-range, but maybe you misunderstand it a bit. Please read this article: blog.mlab.com/2012/06/cardinal-ins Anyway, I want to thank you once again for your time.
Thanks stoos for giving that link and correcting me. Is your question still open, or did it work as you expected?
Unfortunately it is still open. I will continue trying to solve this issue after the holidays. I will update the question if I find something new.
