I'm using MongoDB to store a bunch of information, and need to search an indexed array for values.

Here's the schema:

{ "common_name" : { "name" : "thename", "type" : "thetype" } } 

There are other values in the document, but this is the only one I'm searching.

So I figured something like this would work (in the shell):

db.collection.find({common_name:{$in:"thename"}})

But I get nothing back.

This looks exactly like what is being done here: http://www.php.net/manual/en/mongo.queries.php - but I can't seem to get anything back.

I've tried

db.collection.find({"common_name.name":"thename"})

and it works as expected, but searching several fields of the subdocument (up to 4) could get ugly: I'd basically be defining an $or clause for each subfield and quadrupling my query time. Since this is powering an autocompleter, I can't do that.

Oddly enough, the following doesn't return any documents either:

db.collection.find({"common_name":{"name":"thename"}})

Which, to my understanding, is exactly the same thing as the query above.

I'm pretty new to Mongo, so maybe I'm missing something big here?

Any ideas as to how to get the fastest access to this data (using an anchored regex)?

I could just use a relational table, but doesn't that defeat the purpose of a NoSQL system like Mongo?

2 Answers

The problem is that you don't have an array. An array would look like { "common_name" : [ "name1", "name2", "name3" ] }

With a structure like that, Mongo can index the elements of the array in a single index and quickly find any document that contains any single item from the array.

You could use an array and simply define the offsets to be what you want, e.g. [0] is the name and [1] is the type etc. With that in place you can now find any document quickly.
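A rough sketch of that array layout, assuming the field names from the question. The helpers toArrayForm, escapeRegex and prefixQuery are made-up names for illustration, and the snippet only builds the plain objects you would hand to the shell, so it runs without a server:

```javascript
// Hypothetical helper: flatten the question's embedded document into the
// array layout suggested above ([0] = name, [1] = type).
function toArrayForm(doc) {
  const copy = Object.assign({}, doc);
  copy.common_name = [doc.common_name.name, doc.common_name.type];
  return copy;
}

// Escape regex metacharacters so typed text is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a query document with an anchored (prefix) regex on the array field.
function prefixQuery(prefix) {
  return { common_name: { $regex: "^" + escapeRegex(prefix) } };
}

// Index specification for the array field.
const indexSpec = { common_name: 1 };

// In the mongo shell these would be used roughly as:
//   db.collection.createIndex(indexSpec)
//   db.collection.find(prefixQuery("the"))
const converted = toArrayForm({ common_name: { name: "thename", type: "thetype" } });
console.log(JSON.stringify(converted));
console.log(JSON.stringify(prefixQuery("the")));
```

Keeping the regex case-sensitive and anchored with ^ matters here, since that is the form that can make use of the index.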

You could do this in addition to keeping the data in the named fields (but clearly that would increase storage so isn't recommended).

If your autocomplete doesn't need to be totally up-to-date with the latest documents added to the collection you could instead run a map-reduce job to build a separate collection that you use just for autocomplete. The map step would emit for each of the individual name fields in a document (so 4x as many documents after emit) and the reduce step would count up how many of each name there were (so 1x the number of unique names across all fields). You could index the resulting collection by name and count which would allow your autocomplete to recommend the most common names first.
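To make the map and reduce steps concrete, here is a minimal sketch. It assumes each document keeps its names in a hypothetical names array, and buildAutocomplete is only an in-memory stand-in for the server's map-reduce engine so the example runs on its own:

```javascript
// Map step: emit every name in a document with a count of 1.
// In MongoDB the document is bound to `this` and `emit` is supplied by the
// server; here both are passed in so the sketch runs standalone.
function mapNames(doc, emit) {
  for (const name of doc.names) {
    emit(name, 1);
  }
}

// Reduce step: add up the counts emitted for one name.
function reduceCounts(key, values) {
  return values.reduce((sum, n) => sum + n, 0);
}

// Stand-in driver (not MongoDB's engine): group emitted values by key,
// reduce each group, then sort so the most common names come first.
function buildAutocomplete(docs) {
  const groups = new Map();
  for (const doc of docs) {
    mapNames(doc, (key, value) => {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    });
  }
  const rows = [];
  for (const [key, values] of groups) {
    rows.push({ _id: key, value: reduceCounts(key, values) });
  }
  rows.sort((a, b) => b.value - a.value || a._id.localeCompare(b._id));
  return rows;
}

// Made-up sample data.
const sampleDocs = [
  { names: ["oak", "quercus"] },
  { names: ["oak", "white oak"] },
  { names: ["pine"] },
];
const suggestions = buildAutocomplete(sampleDocs);
console.log(suggestions);
```

The rows use the { _id, value } shape that map-reduce output collections have, so an index on the name and count would serve the autocomplete lookups.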

3 Comments

Ah, I was worried that that might be the issue. I'll change my schema to work that way. Either I misunderstood the php docs here php.net/manual/en/mongo.queries.php, or they are incorrect. Thanks!
Wow, just ran some explain()s after converting it to an array, searching that array with an anchored regex costs 55ms, after indexing it, it's down to 2. That's pretty amazing.
@Jesse - I forgot an index once. Didn't notice until I had nearly a million documents in my collection! Added an index and it went right back to a couple of milliseconds.

db.collection.find({"common_name":{"name":"thename"}})

Which to my understanding, is exactly the same thing as the above query.

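No, it's not. That query only matches when the value of common_name is exactly { "name" : "thename" }, with no other fields. Your documents also have a type field, so nothing matches. To match a single field of the subdocument, use dot notation as in your working query; the $elemMatch operator is what you want when the field holds an array of subdocuments.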
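The difference between the two query styles can be illustrated with a simplified model. This is not MongoDB's implementation, and the function names are made up; it only mimics the matching rules on a single in-memory document:

```javascript
// Simplified model of the two query styles; not MongoDB's implementation.

// {"common_name": {"name": "thename"}} asks for the whole subdocument to
// equal the given one: same fields, same order, nothing extra.
function matchesWholeSubdocument(doc, field, wanted) {
  return JSON.stringify(doc[field]) === JSON.stringify(wanted);
}

// {"common_name.name": "thename"} follows the dotted path and compares
// only the value found at the end of it.
function matchesDottedPath(doc, path, wanted) {
  let value = doc;
  for (const key of path.split(".")) {
    if (value === null || typeof value !== "object") return false;
    value = value[key];
  }
  return value === wanted;
}

const stored = { common_name: { name: "thename", type: "thetype" } };

const wholeMatch = matchesWholeSubdocument(stored, "common_name", { name: "thename" });
const wholeMatchFull = matchesWholeSubdocument(stored, "common_name", { name: "thename", type: "thetype" });
const pathMatch = matchesDottedPath(stored, "common_name.name", "thename");

console.log(wholeMatch, wholeMatchFull, pathMatch); // false true true
```

The partial subdocument fails because the stored value has an extra type field, while the dotted path only looks at name.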
