156

I am trying to perform a regex query using PyMongo against a MongoDB server. The document structure is as follows:

{
  "files": [
    "File 1",
    "File 2",
    "File 3",
    "File 4"
  ],
  "rootFolder": "/Location/Of/Files"
}

I want to get all the files that match the pattern *File. I tried the following:

db.collectionName.find({'files':'/^File/'})

Yet I get nothing back. Am I missing something? According to the MongoDB docs this should be possible. If I run the query in the Mongo console it works fine. Does this mean the API doesn't support it, or am I just using it incorrectly?

4 Answers

216

If you want to include regular expression options (such as ignore case), try this:

import re
regx = re.compile("^foo", re.IGNORECASE)
db.users.find_one({"files": regx})
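
As a rough local illustration of why the query in the question returns nothing: a plain string such as '/^File/' is compared for equality, while a compiled pattern carries a pattern and flags that can be treated as a regex. The sketch below only mimics that behaviour with the standard library against the sample file names from the question. It does not talk to a server, and the variable names are made up for the example.

```python
import re

# Sample values taken from the document in the question.
files = ["File 1", "File 2", "File 3", "File 4"]

# A quoted "/^File/" is just a string, so it only matches an identical value.
literal = "/^File/"
equality_hits = [name for name in files if name == literal]

# A compiled pattern keeps its source and flags available for inspection.
regx = re.compile("^file", re.IGNORECASE)
regex_hits = [name for name in files if regx.search(name)]

print(equality_hits)   # no file is literally named "/^File/"
print(regex_hits)      # every name starts with "file", ignoring case
print(regx.pattern, bool(regx.flags & re.IGNORECASE))
```

Python's re syntax is not identical to the regex engine MongoDB uses, so a local check like this is only a sanity test for simple patterns.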

3 Comments

Note also that regexes anchored at the start (i.e. starting with ^) are able to use indexes in the db, and will run much faster in that case.
Regexes starting with ^ can only use an index in certain cases. When using re.IGNORECASE, I believe Mongo can't use an index to perform the query.
Is this usage documented somewhere? I can't find this in the official pymongo API doc.
185

It turns out regex searches are done a little differently in PyMongo, but they are just as easy.

A regex query looks like this:

db.collectionname.find({'files':{'$regex':'^File'}})

This will match all documents whose files property contains at least one item that starts with File.
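
A minimal sketch of how such a filter could be built, including the $options key mentioned in the comments, is shown here. The helper names build_regex_query and matches_locally are hypothetical and not part of PyMongo. The local matcher is only a stand-in that uses Python's re module to imitate "any array element matches", so it can be run without a server.

```python
import re

def build_regex_query(field, pattern, ignore_case=False):
    """Build a filter document that uses the $regex operator."""
    condition = {"$regex": pattern}
    if ignore_case:
        condition["$options"] = "i"
    return {field: condition}

def matches_locally(document, field, pattern, ignore_case=False):
    """Rough local imitation: an array field matches if any element matches."""
    flags = re.IGNORECASE if ignore_case else 0
    values = document.get(field, [])
    if isinstance(values, str):
        values = [values]
    return any(re.search(pattern, v, flags) for v in values if isinstance(v, str))

doc = {
    "files": ["File 1", "File 2", "File 3", "File 4"],
    "rootFolder": "/Location/Of/Files",
}

query = build_regex_query("files", r"^File")
print(query)
print(matches_locally(doc, "files", r"^File"))
```

With a real connection, the resulting dictionary would be passed as the filter, for example db.collectionname.find(query).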

8 Comments

Actually, what you have here is also the way it's done in JavaScript (and probably other languages too) if you use $regex. @Eric's answer is the Python way, which is a little different.
What's the difference? They're both using Python and PyMongo, correct? It is part of MongoDB queries, so I don't really see the issue.
Ignoring case is also possible with MongoDB's $regex, e.g. db.collectionname.find({'files':{'$regex':'^File','$options':'i'}})
This answer looks better to my eyes. Why bother compiling a Python RE if you're just going to stringify it so that Mongo can compile it again? Mongo's $regex operator takes an $options argument.
Please use the raw string r'^File' instead of '^File' to avoid problems with backslash escapes.
18

To avoid the double compilation you can use the bson regex wrapper that comes with PyMongo:

>>> from bson.regex import Regex
>>> regx = Regex('^foo')
>>> db.users.find_one({"files": regx})

Regex just stores the string without trying to compile it, so find_one can then detect the argument as a 'Regex' type and form the appropriate Mongo query.

I feel this way is slightly more Pythonic than the other top answer, e.g.:

>>> db.collectionname.find({'files':{'$regex':'^File'}})

It's worth reading up on the bson Regex documentation if you plan to use regex queries because there are some caveats.

1 Comment

If you need to match against an array using $in, then $regex will not work for you. bson.regex.Regex will do the trick!
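
As a sketch of the $in case from the comment above: the filter below puts regex objects directly inside $in instead of using $regex. It uses compiled re patterns so it runs with the standard library alone, on the assumption that PyMongo encodes them as regexes just as in the top answer. bson.regex.Regex objects would take their place if you prefer that wrapper. The helper names and the "^Report" pattern are made up for illustration.

```python
import re

def build_in_regex_query(field, patterns):
    """Build a filter that matches if any value of the field matches any pattern."""
    return {field: {"$in": [re.compile(p) for p in patterns]}}

def any_pattern_matches(document, field, compiled_patterns):
    """Rough local imitation of $in with regexes over an array field."""
    values = document.get(field, [])
    return any(p.search(v) for p in compiled_patterns for v in values)

doc = {"files": ["File 1", "File 2", "File 3", "File 4"]}

query = build_in_regex_query("files", [r"^File", r"^Report"])
patterns = query["files"]["$in"]
print([p.pattern for p in patterns])
print(any_pattern_matches(doc, "files", patterns))
```
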
10

The solution based on re doesn't use the index at all. You should use a query like this instead:

db.collectionname.find({'files':{'$regex':'^File'}})

(I cannot comment on the other answers, so I am replying here.)
