
I have a collection with three levels of array nesting, as shown below:

_id: ObjectID('abc'),
sections: [
  {
    sectionId: "sec0",
    sectionName: "ABC",
    contents: [
      {
        contentId: 0,
        tasks: [
           {
             taskId: ObjectID('task1')
           }
           //May contain 1-100 tasks
        ],
        contentDescription: "Content is etc",
      }
    ]
  }
]

sections is an array of objects, each with a sectionId, a sectionName, and a contents array. Each element of contents has a contentId, a contentDescription, and a nested tasks array whose elements each hold a taskId. I am using the $lookup stage to join the nested tasks array with the tasks collection, but the output duplicates the document once per task, as shown below.

_id: ObjectID('abc'),
sections: [
  {
    sectionId: "sec0",
    sectionName: "ABC",
    contents: [
      {
        contentId: 0,
        tasks: [
           {
             //Task Document of ID 1
           }
        ],
        contentDescription: "Content is etc",
      }
    ]
  }
]
_id: ObjectID('abc'),
sections: [
  {
    sectionId: "sec0",
    sectionName: "ABC",
    contents: [
      {
        contentId: 0,
        tasks: [
           {
             //Task Document of ID 2
           }
        ],
        contentDescription: "Content is etc",
      }
    ]
  }
]

The desired output, however, is as follows:

_id: ObjectID('abc'),
sections: [
  {
    sectionId: "sec0",
    sectionName: "ABC",
    contents: [
      {
        contentId: 0,
        tasks: [
           {
             //Task Document of ID 1
           },
           {
             //Task Document of ID 2
           },
           {
             //Task Document of ID 3
           }
        ],
        contentDescription: "Content is etc",
      }
    ]
  }
]

In the collection, a sections array may contain multiple section objects, each of which may contain multiple contents, and so on. The schema is temporary: our company is currently migrating from an existing database to MongoDB, so architectural refactoring is not possible at the moment, and I need to work with the schema design inherited from the other database.

I tried the following:

const contents= await sections.aggregate([
    {
      $match: { _id: id},
    },
    { $unwind: '$sections' },
    {
      $unwind: {
        path: '$sections.contents',
        preserveNullAndEmptyArrays: true,
      },
    },
    {
      $unwind: {
        path: '$sections.contents.tasks',
        preserveNullAndEmptyArrays: true,
      },
    },
    {
      $lookup: {
        from: 'tasks',
        let: { task_id: '$sections.contents.tasks.taskId' },
        pipeline: [
          { $match: { $expr: { $eq: ['$_id', '$$task_id'] } } },
        ],
        as: 'sections.contents.tasks',
      },
    },
    {
      $addFields: {
        'sections.contents.tasks': {
          $arrayElemAt: ['$sections.contents.tasks', 0],
        },
      },
    },
    {
      $group: {
        _id: '$_id',
        exam: { $push: '$sections.contents.tasks' },
      },
    },
  ]);

I am also unable to use the $group stage like this:

$group: {
        _id: '$_id',
        sections: {
           sectionId : { $first: '$sectionId' },
           sectionName: { $first: '$sectionName' },
           contents: {
              contentId: { $first: '$contentId' },
              task: { $push: $sections.contents.tasks }
           }
         },
        },

Any help or direction will be appreciated. I also searched on SO and found this, but couldn't understand the following part:

 {"$group":{
   "_id":{"_id":"$_id","mission_id":"$missions._id"},
   "agent":{"$first":"$agent"},
   "title":{"$first":"$missions.title"},
   "clients":{"$push":"$missions.clients"}
 }},
 {"$group":{
   "_id":"$_id._id",
   "missions":{
     "$push":{
       "_id":"$_id.mission_id",
       "title":"$title",
       "clients":"$clients"
      }
    }
 }}

1 Answer


So you're very close to the final solution. A good rule to remember: if you $unwind x times, you need to $group x times to restore the original structure properly, like so:

db.collection.aggregate([
  {
    $match: {
      _id: id
    },
  },
  {
    $unwind: "$sections"
  },
  {
    $unwind: {
      path: "$sections.contents",
      preserveNullAndEmptyArrays: true,
    },
  },
  {
    $unwind: {
      path: "$sections.contents.tasks",
      preserveNullAndEmptyArrays: true,
    },
  },
  {
    $lookup: {
      from: "tasks",
      let: {
        task_id: "$sections.contents.tasks.taskId"
      },
      pipeline: [
        {
          $match: {
            $expr: {
              $eq: [
                "$_id",
                "$$task_id"
              ]
            }
          }
        }
      ],
      as: "sections.contents.tasks",
    },
  },
  {
    $addFields: {
      "sections.contents.tasks": {
        $arrayElemAt: [
          "$sections.contents.tasks",
          0
        ],
      },
    },
  },
  {
    $group: {
      _id: {
        contentId: "$sections.contents.contentId",
        sectionId: "$sections.sectionId",
        sectionName: "$sections.sectionName",
        originalId: "$_id"
      },
      tasks: {
        $push: "$sections.contents.tasks"
      },
      contentDescription: {
        $first: "$sections.contents.contentDescription"
      },
    }
  },
  {
    $group: {
      _id: {
        sectionId: "$_id.sectionId",
        sectionName: "$_id.sectionName",
        originalId: "$_id.originalId"
      },
      contents: {
        $push: {
          contentId: "$_id.contentId",
          tasks: "$tasks",
          contentDescription: "$contentDescription"
        }
      }
    }
  },
  {
    $group: {
      _id: "$_id.originalId",
      sections: {
        $push: {
          sectionId: "$_id.sectionId",
          sectionName: "$_id.sectionName",
          contents: "$contents"
        }
      }
    }
  }
])

Mongo Playground
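
To make the "unwind x times, group x times" rule concrete, here is a minimal sketch in plain JavaScript (not MongoDB code) that flattens a small document three times and then rebuilds it with three grouping passes. The sample data and the groupBy helper are hypothetical and only mirror the shape of the question's schema.

```javascript
// Hypothetical sample document shaped like the question's schema.
const doc = {
  _id: "abc",
  sections: [
    {
      sectionId: "sec0",
      contents: [
        { contentId: 0, tasks: ["task1", "task2"] },
        { contentId: 1, tasks: ["task3"] },
      ],
    },
  ],
};

// Three "unwinds": one flat row per (section, content, task) combination.
const rows = doc.sections.flatMap((section) =>
  section.contents.flatMap((content) =>
    content.tasks.map((task) => ({
      _id: doc._id,
      sectionId: section.sectionId,
      contentId: content.contentId,
      task,
    }))
  )
);

// Hypothetical helper: buckets rows by key, then builds one output per bucket.
function groupBy(items, keyOf, build) {
  const groups = new Map();
  for (const item of items) {
    const key = JSON.stringify(keyOf(item));
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(item);
  }
  return [...groups.values()].map(build);
}

// Group 1: rebuild the tasks array of each content.
const perContent = groupBy(rows, (r) => [r._id, r.sectionId, r.contentId], (g) => ({
  _id: g[0]._id,
  sectionId: g[0].sectionId,
  contentId: g[0].contentId,
  tasks: g.map((r) => r.task),
}));

// Group 2: rebuild the contents array of each section.
const perSection = groupBy(perContent, (r) => [r._id, r.sectionId], (g) => ({
  _id: g[0]._id,
  sectionId: g[0].sectionId,
  contents: g.map((r) => ({ contentId: r.contentId, tasks: r.tasks })),
}));

// Group 3: rebuild the sections array of each document.
const rebuilt = groupBy(perSection, (r) => [r._id], (g) => ({
  _id: g[0]._id,
  sections: g.map((r) => ({ sectionId: r.sectionId, contents: r.contents })),
}));
```

Each grouping pass undoes exactly one level of flattening, which is why the pipeline above needs three $group stages for its three $unwind stages. Note that this sketch keeps insertion order, whereas $group in MongoDB does not guarantee the order of its output documents.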

However, your pipeline could be made a little cleaner, as it has one redundant $unwind stage, which also forces a redundant $group stage. I won't post the entire fixed pipeline here as it's already a long post, but feel free to check it out here: Mongo Playground fixed
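
For readers who cannot open the link, one possible shape of a pipeline with only two $unwind stages is sketched below. This is an assumption, not necessarily what the linked playground contains: it relies on $lookup accepting an array-valued localField and matching each element against foreignField, so the tasks array never needs to be unwound. The id value is a hypothetical placeholder.

```javascript
const id = "abc"; // hypothetical placeholder for the _id being matched

const pipeline = [
  { $match: { _id: id } },
  { $unwind: "$sections" },
  { $unwind: { path: "$sections.contents", preserveNullAndEmptyArrays: true } },
  {
    // localField resolves to an array of taskIds; each element is matched
    // against tasks._id, and the joined documents replace the tasks array.
    $lookup: {
      from: "tasks",
      localField: "sections.contents.tasks.taskId",
      foreignField: "_id",
      as: "sections.contents.tasks",
    },
  },
  {
    // Undo the second $unwind: collect contents back into each section.
    $group: {
      _id: {
        originalId: "$_id",
        sectionId: "$sections.sectionId",
        sectionName: "$sections.sectionName",
      },
      contents: { $push: "$sections.contents" },
    },
  },
  {
    // Undo the first $unwind: collect sections back into each document.
    $group: {
      _id: "$_id.originalId",
      sections: {
        $push: {
          sectionId: "$_id.sectionId",
          sectionName: "$_id.sectionName",
          contents: "$contents",
        },
      },
    },
  },
];
```

It would be run as db.collection.aggregate(pipeline). One caveat to verify on your own data: the joined task documents may not come back in the same order as the original tasks array.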


2 Comments

Thank you so much, it works perfectly. If you don't mind me asking, I was confused about _id within $group: in your answer, originalId: "$_id" appears alongside $sections.sectionId and is used again in the second group stage. Does originalId: "$_id" in the first group point to the document's _id, like _id: 'ABC'? Thanks in advance. I also used your fixed query and it works like a charm, cheers.
I just needed to save a reference to it for use in the last group stage. Functionally speaking, it isn't really needed here. In theory it is needed only if your pipeline runs on more than one document, since there is then a chance of identical section/content ids appearing in more than one of the documents.
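
The collision described in this comment can be illustrated with a small plain-JavaScript sketch (not MongoDB code). The rows and the countGroups helper are hypothetical; they model two different documents that happen to share the same sectionId.

```javascript
// Hypothetical rows entering a $group keyed by section: two different
// documents both contain a section with sectionId "sec0".
const sectionRows = [
  { originalId: "docA", sectionId: "sec0", contents: ["a-content"] },
  { originalId: "docB", sectionId: "sec0", contents: ["b-content"] },
];

// Counts how many distinct groups a given grouping key would produce.
function countGroups(rows, keyOf) {
  return new Set(rows.map((row) => JSON.stringify(keyOf(row)))).size;
}

// Without originalId in the key, both sections collapse into one group.
const withoutOriginalId = countGroups(sectionRows, (r) => [r.sectionId]);

// With originalId in the key, each document keeps its own section.
const withOriginalId = countGroups(sectionRows, (r) => [r.originalId, r.sectionId]);

console.log(withoutOriginalId, withOriginalId); // 1 2
```

When the pipeline starts with a $match on a single _id, only one document flows through, so the collision cannot occur.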
