3

My actors collection contains an array-of-documents field, called acted_in. Instead of returning the size of acted_in.idmovies like so: {$size: $acted_in.idmovies}, I want to return the number of distinct values inside $acted_in.idmovies. How can I do that ?

c1 = actors.aggregate([{"$match": {'$and': [{'fname': f_name},
                                            {'lname': l_name}]}},
                       {"$project": {'first_name': '$fname',
                                     'last_name': '$lname',
                                     'gender': '$gender',
                                     'distinct_movies_played_in': {'$size': '$acted_in.idmovies'}}}])
1
  • 1
    You can use $setDifference with [] which will remove the duplicates. Something like distinct_movies_played_in': {'$size': { $setDifference: [ '$acted_in.idmovies', []] }} Commented Jun 16, 2017 at 20:16

2 Answers 2

8

You basically need to include $setDifference in there to obtain the "distinct" items. All "sets" are "distinct" by design and by obtaining the "difference" from the present array to an empty one [] you get the desired result. Then you can apply the $size.

You also have some common mistakes/misconceptions. Firstly when using $match or any MongoDB query expression you do not need to use $and unless there is an explicit case to do so. All query expression arguments are "already" AND conditions unless explicitly stated otherwise, as with $or. So don't explicitly use for this case.

Secondly your $project was using the explicit field path variables for every field. You do not need to do that just to return the field, and outside of usage in an "expression", you can simply use a 1 to notate you want it included:

c1  = actors.aggregate([
 { "$match": { "fname"': f_name, "lname": l_name } },
 { "$project": {
   "first_name": 1,
   "last_name": 1,
   "gender": 1,
   "distinct_movies_played_in": { 
     "$size": { "$setDifference": [ "$acted_in.idmovies", [] ] }
   } 
  }}
])

In fact, if you are actually using MongoDB 3.4 or greater ( and your notation of an element within an array "$acted_in.idmovies" says you have at least MongoDB 3.2 ) which has support for $addFields then use that instead of specifying all other fields in the document.

c1  = actors.aggregate([
 { "$match": { "fname"': f_name, "lname": l_name } },
 { "$addFields": {
   "distinct_movies_played_in": { 
     "$size": { "$setDifference": [ "$acted_in.idmovies", [] ] }
   } 
  }}
])

Unless you explicitly need to just specify "some" other fields.

The basic case here is do not use $unwind for array operations unless you specifically need to perform a $group operation on with it's _id key pointing at a value obtained from "within" the array.

In all other cases, MongoDB has far more efficient operators for working with arrays that what $unwind does.

Sign up to request clarification or add additional context in comments.

2 Comments

Useful insight about the inefficiency of the $unwind operator. Your solution would indeed be more efficient than mine!
Thank you for: $countDistinct = $size: { $setDifference: ['$myArray', []] }
2

This should give you what you want:

actors.aggregate([
    {
        $match: {fname: f_name, lname: l_name}
    }, 
    {
        $unwind: '$tags'
    }, 
    {
        $group: {
                    _id: '$_id', 
                    first_name: {$first: '$fname'}, 
                    last_name: {$last: '$lname'}, 
                    gender: {$first: '$gender'}, 
                    tags: {$addToSet: '$tags'}
                }
    }, 
    {
        $project: {
                      first_name: 1, 
                      last_name: 1, 
                      gender: 1, 
                      distinct: {$size: '$tags'}
                  }
    }
])

After the tags array is deconstructed and then put back into a set of itself, then you just need to get the number of items or length of that set.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.