
I have a fairly large amount of data stored in a single document, with roughly the following structure:

{
    "_id": "rPKzOqhVQfCwy2PzRqvyXA",
    "name": "test",
    "raw_data": [
        {},
        ...,
        {}
    ],
    "records": [
        {
            "_id": "xyz_1", // customer generated id
            ...other data
        },
        {
            "_id": "xyz_2", // customer generated id
            ...other data
        },
        {},
        {},
        ...
    ]
}

There can be thousands of records in the document, which I need to store from an imported file, and each record has its own id (programmatically generated). The use case: after saving this file, the user wants to process selected records only (i.e., those with ids xyz_1 and xyz_2).

A lot of other data can be stored in this single document, and I don't want to pull all of it for this use case.

How do I query this document so that I get output like the following?

[
    {
        "_id": "xyz_1", // customer generated id
        ...other data
    },
    {
        "_id": "xyz_2", // customer generated id
        ...other data
    }
]

2 Answers

1

You need to run $unwind and $replaceRoot:

db.collection.aggregate([
    { $unwind: "$records" },
    { $replaceRoot: { newRoot: "$records" } }
])
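To make the effect of the two stages concrete, here is a rough plain-JavaScript emulation of what they do to the result set, plus the `$in` filter on the record ids. It is only a sketch that runs in Node without MongoDB: the sample document, its `value` field, and the helper names (`unwindRecords`, `replaceRootWithRecords`, `matchIds`) are made up for illustration and are not part of any MongoDB API.

```javascript
// Hypothetical sample data: one parent document with three embedded records.
const sampleDocs = [
  {
    _id: "parent_1",
    name: "test",
    records: [
      { _id: "xyz_1", value: 10 },
      { _id: "xyz_2", value: 20 },
      { _id: "xyz_3", value: 30 },
    ],
  },
];

// Emulates { $unwind: "$records" }: one output document per array element,
// with "records" holding that single element. Documents whose array is
// missing or empty produce no output, as with $unwind's default behaviour.
function unwindRecords(docs) {
  return docs.flatMap((doc) =>
    (doc.records || []).map((record) => ({ ...doc, records: record }))
  );
}

// Emulates { $replaceRoot: { newRoot: "$records" } }: the embedded record
// becomes the whole output document.
function replaceRootWithRecords(docs) {
  return docs.map((doc) => doc.records);
}

// Emulates { $match: { _id: { $in: ids } } }.
function matchIds(docs, ids) {
  return docs.filter((doc) => ids.includes(doc._id));
}

const selected = matchIds(
  replaceRootWithRecords(unwindRecords(sampleDocs)),
  ["xyz_1", "xyz_2"]
);
console.log(selected);
```

Nothing here touches the stored documents; like the aggregation itself, it only reshapes the query result.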

6 Comments

@mickl, I think this will search across all documents stored so far in the collection and replace the root in each of them, right?
@rsudip90 The root is replaced in the query result only. You can add $match if you need to filter some of them out.
This is what I did: db.collection.aggregate([ { $unwind: "$records" }, { $replaceRoot: { newRoot: "$records" } }, { $match: { "_id": { $in: ["xyz_1", "xyz_2"] } } } ]). But it will still search across all documents, right? Please have a look at my answer and tell me if any improvements are required.
@rsudip90 add { $match: { "records._id": { $in: ["xyz_1", "xyz_2"] } } } as a first step - this will get single document, reshape it and then bring you only those two nested documents - better in terms of performance
@rsudip90 yes, that's true, that's why I mentioned this first $match, you can match it by _id or whatever else, just add it as a first step
0

As per @mickl's suggestion, my solution to get the desired output is as follows:

db.collection.aggregate([
    { $unwind: "$records" },
    { $replaceRoot: { newRoot: "$records" } },
    { $match: { "_id": { $in: ["xyz_1", "xyz_2"] } } },
])

Update

I think the above solution will go through every document in the collection, replace the root in each of them, and only then run the match.

I wanted to search the records of one parent document only, not of all parent documents in the collection. My concern was that the query should not touch other parent documents, so I ended up with the following solution:

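db.collection.aggregate([
    // Restrict the pipeline to the one parent document (parent_doc_id holds its _id)
    { "$match": { "_id": parent_doc_id } },
    { "$unwind": "$records" },
    // Keep only the selected records
    { "$match": { "records._id": { "$in": ["xyz_1", "xyz_2"] } } },
    // Collect the matching records back into a single array
    { "$group": { "_id": "$_id", "records": { "$push": "$records" } } },
    { "$limit": 1 },
])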
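A possible alternative, which is not what either answer above uses, is to filter the array in place with `$filter` inside a `$project` stage, so the `$unwind` and `$group` round trip is not needed. The sketch below only builds the pipeline as a plain array; the function name `buildSelectedRecordsPipeline` is hypothetical, and it assumes a MongoDB version that supports `$filter` and the aggregation form of `$in` (3.4 or later, as far as I know).

```javascript
// Builds an aggregation pipeline that returns one parent document with
// only the selected records left in its "records" array.
// The function name and parameters are illustrative assumptions.
function buildSelectedRecordsPipeline(parentDocId, recordIds) {
  return [
    // Narrow down to the single parent document first.
    { $match: { _id: parentDocId } },
    {
      $project: {
        // Keep only array elements whose _id is in recordIds.
        records: {
          $filter: {
            input: "$records",
            as: "record",
            cond: { $in: ["$$record._id", recordIds] },
          },
        },
      },
    },
  ];
}

const pipeline = buildSelectedRecordsPipeline("rPKzOqhVQfCwy2PzRqvyXA", [
  "xyz_1",
  "xyz_2",
]);
console.log(JSON.stringify(pipeline, null, 2));
```

In the mongo shell the result would be passed as `db.collection.aggregate(pipeline)`. The output has the same shape as the `$group` version: the parent `_id` plus a `records` array, with the other fields such as `raw_data` left out.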
