5

I am trying to write an $in query with $regex using MongoDB and Java. It does not work in the mongo shell either: I get no results, but no query parse error. Here is the final query, taken from the Java debugger at the line where I call collection.find(finalQuery):

{"$and": [ 
    {"$or": [
        {"country": "united states"}
    ]}, 
    {"businesses": {
        "$in": [
            {"$regex": "^.*cardinal.*health.*$"},
            {"$regex": "^.*the.*hartford.*$"}
        ]
    }}
]}

Java code snippet for the above query:

Set<Pattern> businesses = new HashSet<Pattern>();

// Anchor each search term and compile it to a regex
for (String st : srchTerms) {
    businesses.add(Pattern.compile("^" + st.trim() + "$"));
}
srchTermQuery.append("businesses", new BasicDBObject("$in", businesses));

However, the following query works in the mongo shell, but I don't know how to write it in Java:

{"registering_organization": {
    "$in": [
        /^.*cardinal.*health.*$/,
        /^.*the.*hartford.*$/
    ]
}}

Java adds double quotes around the regex if we try to define it as a string.

2
  • Yes, I can reproduce this problem on MongoDB 2.4.5 via the shell. I suggest filing a bug on their JIRA site: jira.mongodb.org Commented Aug 1, 2013 at 4:18
  • Right, I forgot to mention the version. Mine is MongoDB 2.4.5 as well. Commented Aug 1, 2013 at 16:24

4 Answers

4

The behavior you're seeing might be a bug. As an alternative, however, you can write your query like this:

Pattern pattern = Pattern.compile("(^aaa$)|(^bbb$)");
srchTermQuery.append("businesses", pattern);

Not pretty, but it seems to do the trick.
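
For more than a couple of search terms, a single alternation pattern like the one above can be built in a loop. The following is only a minimal sketch, not the original poster's code: the class name AlternationSketch and the method combine are hypothetical, and it assumes each term is already a regex fragment, as in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class AlternationSketch {

    // Joins the terms into one anchored alternation: ^(?:(?:t1)|(?:t2))$
    // Assumption: each term is a regex fragment and is NOT escaped here.
    static Pattern combine(List<String> terms) {
        List<String> parts = new ArrayList<>();
        for (String term : terms) {
            parts.add("(?:" + term.trim() + ")");
        }
        return Pattern.compile("^(?:" + String.join("|", parts) + ")$");
    }

    public static void main(String[] args) {
        Pattern p = combine(Arrays.asList(".*cardinal.*health.*", ".*the.*hartford.*"));
        System.out.println(p.pattern());
        // Local sanity check against one of the sample values from this thread
        System.out.println(p.matcher("the hartford-070531ads308").find()); // true
    }
}
```

The resulting Pattern could then be appended to the query in the same way as the hand-written one, for example srchTermQuery.append("businesses", pattern).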


1 Comment

However, one regular expression with multiple "|" alternatives seems to perform worse than the $in query. I can send the explain() output if you're interested. I have also tried an $or query instead of $in, which also seems to perform worse than $in :(
3

You're not going to be able to convert:

{"businesses" : {
    "$in":[
        /^.*cardinal.*health.*$/,
        /^.*the.*hartford.*$/
    ]
}}

directly into a Java regex. This is not a bug; it's because the Java driver uses the $regex format when creating regex queries, to avoid any ambiguity.

The $regex documentation gives these two forms as equivalent:

db.collection.find({field: /acme.*corp/});
db.collection.find({field: {$regex: 'acme.*corp'}});

So your Java-generated query of:

{"businesses": {
    "$in": [
        {"$regex": "^.*cardinal.*health.*$"}, 
        {"$regex": "^.*the.*hartford.*$"}
    ]
}}

is exactly equivalent to the query you were trying to convert:

{"businesses": {
    "$in": [
        /^.*cardinal.*health.*$/,
        /^.*the.*hartford.*$/
    ]
}}

In summary, the Java you've written is already the correct way to convert the query you wanted. I've run it in my own test and it returns the expected results.

Perhaps if you included some sample documents that you expect to be returned by the query we could help further?

2 Comments

I still can't seem to get a result from the $in and regex combination. Here is a similar query that does return data:
{ "$or" : [ { "registering_organization" : { "$regex" : "^.*cardinal.*health.*$"}} , { "registering_organization" : { "$regex" : "^.*the.*hartford.*$"}} ]}
Sample data:
{ "country" : "united states", "registering_organization" : "the hartford-070531ads308"}, { "country" : "united states", "registering_organization" : "thehartfordpayerw9809523465"}
The $in operator doesn't support $regex. You can see here: docs.mongodb.com/manual/reference/operator/query/in/…
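
To rule out the patterns themselves, they can be checked locally against the two sample values quoted in the comment above. This is only a sketch using java.util.regex, whose syntax differs from the server's regex engine in some details (patterns this simple should behave the same in both). PatternCheckSketch and matchesAny are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class PatternCheckSketch {

    // Returns true if at least one pattern is found in the value,
    // mimicking "match any of these regexes" on the client side.
    static boolean matchesAny(List<Pattern> patterns, String value) {
        for (Pattern p : patterns) {
            if (p.matcher(value).find()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Pattern> patterns = Arrays.asList(
                Pattern.compile("^.*cardinal.*health.*$"),
                Pattern.compile("^.*the.*hartford.*$"));

        // Sample values taken from the comment above
        System.out.println(matchesAny(patterns, "the hartford-070531ads308"));   // true
        System.out.println(matchesAny(patterns, "thehartfordpayerw9809523465")); // true
    }
}
```

If both values match locally, the regexes are fine and the empty result comes from how the query is built or which field it targets.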
2

I had a need to list all keys beginning with a specified string. The following worked for me in CLI:

db.crawlHTML.count({"_id": /^1001/})

The following was the Java implementation:

public List<String> listKeysLike(DB mongoDB, String likeChars) throws Exception {

    DBCollection dbCollection = this.getHTMLCollection(mongoDB, TESTPROD);
    List<String> keyList = new ArrayList<String>();

    BasicDBObject query = new BasicDBObject();      
    String queryString = "^" + likeChars.trim() ;   // setup regex
    query.put("_id", java.util.regex.Pattern.compile(queryString));  
    DBCursor cursor = dbCollection.find(query);

    while (cursor.hasNext()) {      // _id used as the primary key
        BasicDBObject obj = (BasicDBObject) cursor.next();
        String tempString = obj.getString("_id");
        keyList.add(tempString);
    }       // while

    return keyList;
}

NB: The "TESTPROD" just tells me which of two databases I should be using.
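
One caveat with the method above: likeChars goes into the regex unescaped, so characters such as "." or "(" are treated as regex syntax. Below is a minimal sketch of a literal-prefix variant, assuming the prefix should always be matched literally. PrefixPatternSketch and prefixPattern are hypothetical names. Pattern.quote wraps the text in \Q...\E; I would expect the server's regex engine to accept that too, but treat it as an assumption and verify it against your MongoDB version.

```java
import java.util.regex.Pattern;

class PrefixPatternSketch {

    // Builds "^" + quoted prefix so that regex metacharacters in the
    // prefix are matched literally instead of being interpreted.
    static Pattern prefixPattern(String likeChars) {
        return Pattern.compile("^" + Pattern.quote(likeChars.trim()));
    }

    public static void main(String[] args) {
        Pattern p = prefixPattern("1.0");
        System.out.println(p.matcher("1.0abc").find()); // true
        System.out.println(p.matcher("1x0abc").find()); // false: "." is literal here
    }
}
```

The returned Pattern could be used in place of java.util.regex.Pattern.compile(queryString) in the query.put call above.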


1

You have to use MongoDB regex notation rather than putting the regex in a string:

db.somecollection.find({records: {$in: [/.*somestring.*/]}})
