55

I've got two collections in my Mongo database, and the Foos contain references to one or more Bars:

Foo: { 
  prop1: true,
  prop2: true,
  bars: [
     {
     "$ref": "Bar",
     "$id": ObjectId("blahblahblah")
     }
  ]
}

Bar: {
   testprop: true
}

What I want is to find all of the Foos that have at least one Bar that has its testprop set to true. I've tried this command, but it doesn't return any results:

db.Foo.find({ "bars.testprop" : { "$in": [ true ] } })

Any ideas?

1
  • You cannot do join queries between two collections with find(). Commented Mar 8, 2012 at 17:26

5 Answers 5

91

You can now do it in Mongo 3.2 using $lookup

$lookup takes four arguments

from: Specifies the collection in the same database to perform the join with. The from collection cannot be sharded.

localField: Specifies the field from the documents input to the $lookup stage. $lookup performs an equality match on the localField to the foreignField from the documents of the from collection.

foreignField: Specifies the field from the documents in the from collection.

as: Specifies the name of the new array field to add to the input documents. The new array field contains the matching documents from the from collection.

db.Foo.aggregate(
  {$unwind: "$bars"},
  {$lookup: {
    from:"bar",
    localField: "bars",
    foreignField: "_id",
    as: "bar"

   }},
   {$match: {
    "bar.testprop": true
   }}
)
Sign up to request clarification or add additional context in comments.

4 Comments

This answer needs explanation.
how can I see the result of the above query in mongodb shell? @sidgate
@Pat you should change this to be the accepted answer
Wouldn't this need localField: "bars.$id", given the OP's data structure?
21

You can't. See http://www.mongodb.org/display/DOCS/Database+References

You have to do it in the client.

6 Comments

Well... crap. That sucks. Thanks.
@Pat did you every find an answer to your question? If so, can you provide the code, i'm struggling a bit with this
@brianScroggins The answer was basically "can't do it." Instead, I ended up reworking how I had modeled my data. Unfortunately, I don't still have the code handy.
The answer is generally to de-normalize the data into sub-elements of the parent object. It adds the overhead of maintaining the de-normalized records, but allows for this query pattern.
@Pat you can now query in 3.2 Look at my answer below
|
6

We have had a similar issue as we use MongoDB (3.4.4, actually 3.5.5 for testing) in combination with Morphia where we use @Referenece on a couple of entities. We are though not that happy with this solution and are considering removing these declarations and instead do the reference lookups manually.

I.e. we have a company collection and a user collection. The user entity in Morphia contains a @Refrence declaration on a company entity. The respective company collections contains entries like:

/* 1 */
{
    "_id" : ObjectId("59a92501df01110fbb6a5dee"),
    "name" : "Test",
    "gln" : "1234567890123",
    "uuid" : "f1f86961-e8d5-40bb-9d3f-fdbcf549066e",
    "creationDate" : ISODate("2017-09-01T09:14:41.551Z"),
    "lastChange" : ISODate("2017-09-01T09:14:41.551Z"),
    "version" : NumberLong(1),
    "disabled" : false
}

/* 2 */
{
    "_id" : ObjectId("59a92501df01110fbb6a5def"),
    "name" : "Sample",
    "gln" : "3210987654321",
    "uuid" : "fee69ee4-b29c-483b-b40d-e702b50b0451",
    "creationDate" : ISODate("2017-09-01T09:14:41.562Z"),
    "lastChange" : ISODate("2017-09-01T09:14:41.562Z"),
    "version" : NumberLong(1),
    "disabled" : false
}

while the user collections contains the following entries:

/* 1 */
{
    "_id" : ObjectId("59a92501df01110fbb6a5df0"),
    "userId" : "admin",
    "userKeyEncrypted" : {
        "salt" : "78e0528db239fd86",
        "encryptedAttribute" : "e4543ddac7cca9757721379e4e70567bb13956694f473b73f7723ac2e2fc5245"
    },
    "passwordHash" : "$2a$10$STRNORu9rcbq4qYUMld4G.HJk8QQQQBmAswSNC/4PBn2bih0BvjM6",
    "roles" : [ 
        "ADMIN"
    ],
    "company" : {
        "$ref" : "company",
        "$id" : ObjectId("59a92501df01110fbb6a5dee")
    },
    "uuid" : "b8aafdcf-d5c4-4040-a96d-8ab1a8608af8",
    "creationDate" : ISODate("2017-09-01T09:14:41.673Z"),
    "lastChange" : ISODate("2017-09-01T09:14:41.765Z"),
    "version" : NumberLong(1),
    "disabled" : false
}

/* 2 */
{
    "_id" : ObjectId("59a92501df01110fbb6a5df1"),
    "userId" : "sample",
    "userKeyEncrypted" : {
        "salt" : "e3ac48695dea5f51",
        "encryptedAttribute" : "e804758b0fd13c219c3fc383eaa9267b70f7b8a1ed74f05575add713ce11804a"
    },
    "passwordHash" : "$2a$10$Gt2dq1vy4J9MeqDnXjokAOtvFcvbhe/g9wAENXFPaPxLAw1L4EULG",
    "roles" : [ 
        "USER"
    ],
    "company" : {
        "$ref" : "company",
        "$id" : ObjectId("59a92501df01110fbb6a5def")
    },
    "uuid" : "55b62d4c-e5ee-408d-80c0-b79e02085b02",
    "creationDate" : ISODate("2017-09-01T09:14:41.873Z"),
    "lastChange" : ISODate("2017-09-01T09:14:41.878Z"),
    "version" : NumberLong(1),
    "disabled" : false
}

/* 3 */
{
    "_id" : ObjectId("59a92501df01110fbb6a5df2"),
    "userId" : "user",
    "userKeyEncrypted" : {
        "salt" : "ab9df671340a7d8b",
        "encryptedAttribute" : "7d8ad4ca6ad88686d810c70498407032f1df830596f72d931880483874d9cce3"
    },
    "passwordHash" : "$2a$10$0FLFw3ixW79JIBrD82Ly6ebOwnEDliS.e7GmrNkFp2nkWDA9OE/RC",
    "uuid" : "d02aef94-fc3c-4539-a22e-e43b8cd78aaf",
    "creationDate" : ISODate("2017-09-01T09:14:41.991Z"),
    "lastChange" : ISODate("2017-09-01T09:14:41.995Z"),
    "version" : NumberLong(1),
    "disabled" : false
}

In order to create a special company user view we also wanted to dereference the company in the user and only include selected fields. Based on a comment within a bug report we learned that MongoDB provides a $objectToArray: "$$ROOT.element" operation which basically splits fields of the given elements into key and value pairs. Note that $objectToArray operation was added in MongoDB version 3.4.4!

An aggregation on the company element contained in the user collection using the $objectToArray operation may look like below:

dp.user.aggregate([{ 
    $project: { 
        "userId": 1, 
        "userKeyEncrypted": 1, 
        "uuid":1, 
        "roles": 1, 
        "passwordHash": 1, 
        "disabled": 1, 
        company: { $objectToArray: "$$ROOT.company" }
    } 
}])

The result of above aggregation looks like this:

/* 1 */
{
    "_id" : ObjectId("59a92501df01110fbb6a5df0"),
    "userId" : "admin",
    "userKeyEncrypted" : {
        "salt" : "78e0528db239fd86",
        "encryptedAttribute" : "e4543ddac7cca9757721379e4e70567bb13956694f473b73f7723ac2e2fc5245"
    },
    "passwordHash" : "$2a$10$STRNORu9rcbq4qYUMld4G.HJk8QQQQBmAswSNC/4PBn2bih0BvjM6",
    "roles" : [ 
        "ADMIN"
    ],
    "uuid" : "b8aafdcf-d5c4-4040-a96d-8ab1a8608af8",
    "disabled" : false,
    "company" : [ 
        {
            "k" : "$ref",
            "v" : "company"
        }, 
        {
            "k" : "$id",
            "v" : ObjectId("59a92501df01110fbb6a5dee")
        }
    ]
}

/* 2 */
{
    "_id" : ObjectId("59a92501df01110fbb6a5df1"),
    "userId" : "sample",
    "userKeyEncrypted" : {
        "salt" : "e3ac48695dea5f51",
        "encryptedAttribute" : "e804758b0fd13c219c3fc383eaa9267b70f7b8a1ed74f05575add713ce11804a"
    },
    "passwordHash" : "$2a$10$Gt2dq1vy4J9MeqDnXjokAOtvFcvbhe/g9wAENXFPaPxLAw1L4EULG",
    "roles" : [ 
        "USER"
    ],
    "uuid" : "55b62d4c-e5ee-408d-80c0-b79e02085b02",
    "disabled" : false,
    "company" : [ 
        {
            "k" : "$ref",
            "v" : "company"
        }, 
        {
            "k" : "$id",
            "v" : ObjectId("59a92501df01110fbb6a5def")
        }
    ]
}

/* 3 */
{
    "_id" : ObjectId("59a92501df01110fbb6a5df2"),
    "userId" : "user",
    "userKeyEncrypted" : {
        "salt" : "ab9df671340a7d8b",
        "encryptedAttribute" : "7d8ad4ca6ad88686d810c70498407032f1df830596f72d931880483874d9cce3"
    },
    "passwordHash" : "$2a$10$0FLFw3ixW79JIBrD82Ly6ebOwnEDliS.e7GmrNkFp2nkWDA9OE/RC",
    "uuid" : "d02aef94-fc3c-4539-a22e-e43b8cd78aaf",
    "disabled" : false,
    "company" : null
}

Now it's simply a matter of filtering unwanted stuff (i.e. users that have no company assigned and selecting the right array entries) in order to feed the $lookup operation @sidgate has already explained and copy the value of the dereferenced company into the user response.

I.e. an aggregation like the one below will perform an join and add the data of the company to users that have a company assigned as the as value defined in the lookup:

db.user.aggregate([
    { $project: { "userId": 1, "userKeyEncrypted": 1, "uuid":1, "roles": 1, "passwordHash": 1, "disabled": 1, company: { $objectToArray: "$$ROOT.company" }} }, 
    { $unwind: "$company" }, 
    { $match: { "company.k": "$id"}  }, 
    { $lookup: { from: "company", localField: "company.v", foreignField: "_id", as: "company_data" } }
])

The result to the above aggregation can be seen below:

/* 1 */
{
    "_id" : ObjectId("59a92501df01110fbb6a5df0"),
    "userId" : "admin",
    "userKeyEncrypted" : {
        "salt" : "78e0528db239fd86",
        "encryptedAttribute" : "e4543ddac7cca9757721379e4e70567bb13956694f473b73f7723ac2e2fc5245"
    },
    "passwordHash" : "$2a$10$STRNORu9rcbq4qYUMld4G.HJk8QQQQBmAswSNC/4PBn2bih0BvjM6",
    "roles" : [ 
        "ADMIN"
    ],
    "uuid" : "b8aafdcf-d5c4-4040-a96d-8ab1a8608af8",
    "disabled" : false,
    "company" : {
        "k" : "$id",
        "v" : ObjectId("59a92501df01110fbb6a5dee")
    },
    "company_data" : [ 
        {
            "_id" : ObjectId("59a92501df01110fbb6a5dee"),
            "name" : "Test",
            "gln" : "1234567890123",
            "uuid" : "f1f86961-e8d5-40bb-9d3f-fdbcf549066e",
            "creationDate" : ISODate("2017-09-01T09:14:41.551Z"),
            "lastChange" : ISODate("2017-09-01T09:14:41.551Z"),
            "version" : NumberLong(1),
            "disabled" : false
        }
    ]
}

/* 2 */
{
    "_id" : ObjectId("59a92501df01110fbb6a5df1"),
    "userId" : "sample",
    "userKeyEncrypted" : {
        "salt" : "e3ac48695dea5f51",
        "encryptedAttribute" : "e804758b0fd13c219c3fc383eaa9267b70f7b8a1ed74f05575add713ce11804a"
    },
    "passwordHash" : "$2a$10$Gt2dq1vy4J9MeqDnXjokAOtvFcvbhe/g9wAENXFPaPxLAw1L4EULG",
    "roles" : [ 
        "USER"
    ],
    "uuid" : "55b62d4c-e5ee-408d-80c0-b79e02085b02",
    "disabled" : false,
    "company" : {
        "k" : "$id",
        "v" : ObjectId("59a92501df01110fbb6a5def")
    },
    "company_data" : [ 
        {
            "_id" : ObjectId("59a92501df01110fbb6a5def"),
            "name" : "Sample",
            "gln" : "3210987654321",
            "uuid" : "fee69ee4-b29c-483b-b40d-e702b50b0451",
            "creationDate" : ISODate("2017-09-01T09:14:41.562Z"),
            "lastChange" : ISODate("2017-09-01T09:14:41.562Z"),
            "version" : NumberLong(1),
            "disabled" : false
        }
    ]
}

As can hopefully be seen we only have the two users that contained a company reference and the two users now have also the complete company data in the response. Now additional filtering can be applied to get rid of the key/value helper and also to hide unwanted data.

The final query we came up with looks like this:

db.user.aggregate([
    { $project: { "userId": 1, "userKeyEncrypted": 1, "uuid":1, "roles": 1, "passwordHash": 1, "disabled": 1, company: { $objectToArray: "$$ROOT.company" }} }, 
    { $unwind: "$company" }, 
    { $match: { "company.k": "$id"}  }, 
    { $lookup: { from: "company", localField: "company.v", foreignField: "_id", as: "company_data" } },
    { $project: { "userId": 1, "userKeyEncrypted": 1, "uuid":1, "roles": 1, "passwordHash": 1, "disabled": 1,  "companyUuid": { $arrayElemAt: [ "$company_data.uuid", 0 ] } } }
])

Which finally returns our desired representation:

/* 1 */
{
    "_id" : ObjectId("59a92501df01110fbb6a5df0"),
    "userId" : "admin",
    "userKeyEncrypted" : {
        "salt" : "78e0528db239fd86",
        "encryptedAttribute" : "e4543ddac7cca9757721379e4e70567bb13956694f473b73f7723ac2e2fc5245"
    },
    "passwordHash" : "$2a$10$STRNORu9rcbq4qYUMld4G.HJk8QQQQBmAswSNC/4PBn2bih0BvjM6",
    "roles" : [ 
        "ADMIN"
    ],
    "uuid" : "b8aafdcf-d5c4-4040-a96d-8ab1a8608af8",
    "disabled" : false,
    "companyUuid" : "f1f86961-e8d5-40bb-9d3f-fdbcf549066e"
}

/* 2 */
{
    "_id" : ObjectId("59a92501df01110fbb6a5df1"),
    "userId" : "sample",
    "userKeyEncrypted" : {
        "salt" : "e3ac48695dea5f51",
        "encryptedAttribute" : "e804758b0fd13c219c3fc383eaa9267b70f7b8a1ed74f05575add713ce11804a"
    },
    "passwordHash" : "$2a$10$Gt2dq1vy4J9MeqDnXjokAOtvFcvbhe/g9wAENXFPaPxLAw1L4EULG",
    "roles" : [ 
        "USER"
    ],
    "uuid" : "55b62d4c-e5ee-408d-80c0-b79e02085b02",
    "disabled" : false,
    "companyUuid" : "fee69ee4-b29c-483b-b40d-e702b50b0451"
}

Some final note to this approach: This aggregation isn't very fast, sadly, but at least it gets the job done. I haven't tested it with an array of references as originally asked though this may require some additional unwindings probably.


Update: A further way of aggregating the data, which is more in line with the comments in the above mentioned bug report, can be seen below:

db.user.aggregate([
    { $project: { "userId": 1, "userKeyEncrypted": 1, "uuid":1, "roles": 1, "passwordHash": 1, "disabled": 1, companyRefs: { $let: { vars: { refParts: { $objectToArray: "$$ROOT.company" }}, in: "$$refParts.v" } } } },
    { $match: { "companyRefs": { $exists: true } } },
    { $project: { "userId": 1, "userKeyEncrypted": 1, "uuid":1, "roles": 1, "passwordHash": 1, "disabled": 1, "companyRef": { $arrayElemAt: [ "$companyRefs", 1 ] } } },
    { $lookup: { from: "company", localField: "companyRef", foreignField: "_id", as: "company_data" } },
    { $project: { "userId": 1, "userKeyEncrypted": 1, "uuid":1, "roles": 1, "passwordHash": 1, "disabled": 1,  "companyUuid": { $arrayElemAt: [ "$company_data.uuid", 0 ] } } }
])

Here the $let: { vars: ..., in: ... } operation copies the key and value of the reference into an own object and thus allows later on to lookup the reference via the corresponding operation.

Which of these aggregations performs better has yet to be profiled.

4 Comments

Is this better than using SQL?
@Christophe If you compare SQL with NoSQL databases, one thing that you might quickly notice is that SQL databases store the data usually in third normal form (3NF) which often leads to splitting the data up into multiple tables that need to be referenced and joined later on. In NoSQL databases on the contrary often all the data that is ever needed in included in the entry itself as such looking up references is rarely needed, though from time to time such a requirement may be present, as in my case i.e. Whether SQL may be better than NoSQL depends mostly on your needs.
@Christophe If you need lookups all of the time, either NoSQL databases may not be the best fit for your model or you should reiterate on the data model you want/need to store in your NoSQL collections. For rare lookups in other collections the current toolset should be fine, especially if your applications already heavily rely on the lose structure of entries. Exporting all of your data to SQL just to profit from better joins is also overkill
Hello Roman, sorry if my remark hurt you. This was not meant. I worked during 25 years with SQL and now only 1 year with mongo, so NoSQL. Honestly, I have hard times trying to do such basic requests in MongoDB that sometimes I am wondering why we do such things. I am working for an existing project, so it is barely possible to change to SQL. I have 6 entities connected to each others, so having everything in a single collection would be hardly usable. Anyway, I've found another way to retrieve the data I wanted.
1

Well.. you could query the Bar Model for the _id of all documents with testprop: true, then do a find $in and populate bars on the Foo Model with an array of those _id's you got from the first query.. :P

Maybe that counts as "In the Client" :P just a thought.

Comments

1

It wasn't possible before, but improvements from Mongo v3.4 we can get very close to it.

You can do it with mongo-join-query. Your code would look like this:

const mongoose = require("mongoose");
const joinQuery = require("mongo-join-query");

joinQuery(
    mongoose.models.Foo,
    {
        find: { "bars.testprop": { $in: [true] } },
        populate: ["bars"]
    },
    (err, res) => (err ? console.log("Error:", err) : console.log("Success:", res.results))
);

How does it work?

Behind the scenes mongo-join-query will use your Mongoose schema to determine which models to join and will create an aggregation pipeline that will perform the join and the query.

Disclosure: I wrote this library to tackle precisely this use case.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.