45

I've got a MongoDB collection where a particular string may appear in any of a number of fields:

{"_id":1, "field1": "foo", "field2": "bar", "field3": "baz", "otherfield": "stuff"},
{"_id":2, "field1": "bar", "field2": "baz", "field3": "foo", "otherfield": "morestuff"},
{"_id":3, "field1": "baz", "field2": "foo", "field3": "bar", "otherfield": "you get the idea"}

I need to query so that I am returned all records where any one of a set of fields is equal to any value in an array ... basically, if I have ["foo","bar"] I need it to match if either of those strings are in field1 or field2 (but not any other field).

Obviously I can do this with a series of multiple queries

db.collection.find({"field1":{"$in":["foo","bar"]}}) db.collection.find({"field2":{"$in":["foo","bar"]}})

etc., and I've also made a very large $or query that concatenates them all together, but it seems far too inefficient (my actual collection needs to match any of 15 strings that can occur in any of 9 fields) ... but I'm still new to nosql DBs and am not sure of the best paradigm I need to use here. Any help is greatly appreciated.

3 Answers 3

79

try

db.collection.find(
    // Find documents matching any of these values
    {$or:[
        {"field1":{"$in":["foo","bar"]}},
        {"field2":{"$in":["foo","bar"]}}
    ]}
)

also refer to this question

Sign up to request clarification or add additional context in comments.

8 Comments

jlmcdonald has said that he tried to concatenate the queries to a large $or query...
Right ... and this works, but there's got to be a more efficient way than this.
you mean by efficient... an efficient way to write it ... or an efficient way that mongo executes that query...
That's a good question ... as I mentioned, I've got 15 strings in any of 9 fields, so it makes for a pretty ugly looking $or query when written, so having something more readable would be a small benefit. But my biggest concern is with the expense on the DB. I'm assuming that a concatenated $or query is performing each query and then aggregating results but maybe that's wrong and $or isn't terribly expensive?
as for the looking i don't know a better way for the query to look better... but as for the expense to the database all of this join/filter/union queries used inside the find are called Criteria and this criteria is lazy loaded.. so the find bigger query only hits the database once returning a cursor... so there is no worry about the performance... also if you want the query not to hit the database at all... use where instead of find this way you can get the Criteria ... without hitting the db... and you can evaluate the returned criteria when needed (this will hit the db)
|
16

Found another answer through poring over the documentation that seems to hit a sweet spot -- text indexes.

db.collection.ensureIndex({"field1":"text","field2":"text"})
db.records.runCommand("text",{search:"foo bar"})

When I run my actual query with many more strings and fields (and about 100,000 records), the $or/$in approach takes 620 milliseconds while the text index takes 131 milliseconds. The one drawback is that it returns a different type of document as a result; luckily the actual documents are a parameter of each result object.

Thanks to those who took the time to make suggestions.

Comments

0

I would collect all the relevant fields in one field (i.e. collected) by adding their values like

"foo:field1", 
"bar:field2", 
"baz:field3",  
"stuff:otherfield", 
"bar:field1", 
"baz:field2"
...

into that field.

If you search for bar existing in any field you can use:

db.collection.find( { collected: { $regex: "^bar" } }, ... );

Your example in the question would look like:

db.collection.find( collected: { { $all: [ "foo:field1", "foo:field2", "bar:field1", "bar:field2" ] } }, ... ); 

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.