I have a document that looks like:

{
    "_id": ObjectId(),
    "employees": [
        {
            "_id": ObjectId(),
            "sharedBranches": [
                ObjectId(),
                ObjectId()
            ]
        },
        {
            "_id": ObjectId()
        }
    ]
}

I am trying to return the documents that contain my input ObjectId in the sharedBranches field, and also filter each document's employees array down so it only contains objects whose sharedBranches contains that ObjectId.

However, not every employee object (i.e. each element of the employees array) contains the sharedBranches field. My query returns an error, which I am pretty sure is due to the nulls, but I can't figure out the syntax for $ifNull. Here is my query (note that branch_id is the input ObjectId I am searching on):

from bson import ObjectId

# client is an existing pymongo.MongoClient; branch_id is the input ID
collection = client["collection"]["documents"]
pipeline = [
    {
        "$match": {
            "employees.sharedBranches": {"$elemMatch": {"$eq": ObjectId(branch_id)}},
        }
    },
    {
        "$project": {
            "employees": {
                "$filter": {
                    "input": "$employees",
                    "as": "employees",
                    "cond": {"$in": [ObjectId(branch_id), {"$ifNull": ["$$employees.sharedBranches", []]}]}
                }
            }
        }
    }
]

This query returns the error:

OperationFailure: $in requires an array as a second argument, found: object, full error: {'ok': 0.0, 'code': 40081, 'errmsg': '$in requires an array as a second argument, found: object', 'operationTime': Timestamp(1639079887, 1)}

It seems that the $ifNull expression is not evaluating to an array. If I remove the $ifNull and just use $in on the array directly (so my cond looks like "cond": {"$in": [ObjectId(branch_id), "$$employees.sharedBranches"]}),

I get this error:

OperationFailure: $in requires an array as a second argument, found: string, full error: {'ok': 0.0, 'code': 40081, 'errmsg': '$in requires an array as a second argument, found: string', 'operationTime': Timestamp(1639080588, 1)}

So I am at a loss as to how to resolve this. Is my issue with the $ifNull? Am I mistaken that it's needed at all?

  • found: object - I'm pretty sure null is not an object. Have you tried checking for array with $type? Commented Dec 9, 2021 at 21:05
  • Weird. I duped your input data and created the same pipeline and it worked. The $ifNull expr properly turned the blank (missing) array into [] and the $filter worked fine. Commented Dec 9, 2021 at 22:04
  • Here's what I think happened: The thing that built the sharedBranches array created a single string val instead of an array of one. Commented Dec 9, 2021 at 22:22
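
One way to test the guesses in the comments above is to count which BSON types sharedBranches actually takes across all employees. The following is a minimal sketch and not something from the original thread: the helper name `build_type_histogram_pipeline` is hypothetical, and it only builds the pipeline as plain Python data, so the result still has to be passed to `collection.aggregate(...)` on your own collection.

```python
def build_type_histogram_pipeline(array_field="employees", sub_field="sharedBranches"):
    """Build a pipeline that counts the BSON type of sub_field per array element.

    The helper name and both parameter defaults are illustrative; the field
    names are taken from the question.
    """
    path = f"${array_field}.{sub_field}"
    return [
        # One output document per element of the employees array.
        {"$unwind": f"${array_field}"},
        # $type yields a type name such as "array", "string", "object" or "missing".
        {"$group": {"_id": {"$type": path}, "count": {"$sum": 1}}},
        # Most common type first.
        {"$sort": {"count": -1}},
    ]


pipeline = build_type_histogram_pipeline()
```

$ifNull only substitutes its fallback for null, undefined, or missing values, so any other non-array type in this output (for example string or object) would still reach $in unchanged and be rejected.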

1 Answer

I suspect some of your sharedBranches fields are not arrays but strings holding a single ID. Here is a little trick that checks the $type of the field and, if it is not an array (this includes a missing field, for which $type returns "missing"), wraps it in a one-element array:

c = db.foo.aggregate([
    {$project: {
        employees: {$filter: {
            input: "$employees",
            as: "employees",
            cond: {$in: [targetSharedBranchID, {$cond: {
                if: {$ne: [{$type: '$$employees.sharedBranches'}, "array"]},
                // ah HA!  Create array of one on the fly.
                // OK if missing; the array literal becomes [null], which never matches.
                then: ['$$employees.sharedBranches'],
                else: '$$employees.sharedBranches'
            }}]}
        }}
    }},

    {$match: {$expr: {$gt: [{$size: "$employees"}, 0]}}}
]);
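
For PyMongo users, the same pipeline can be written as plain Python dictionaries. This is a rough translation of the shell code above, not a tested fix for the asker's data: the function name `build_shared_branch_pipeline` and the variable name `employee` are hypothetical, and the sketch assumes every document has an employees array (otherwise $size would fail in the last stage).

```python
def build_shared_branch_pipeline(branch_id):
    """Keep only employees whose sharedBranches contains branch_id.

    branch_id should already be the value stored in the array,
    e.g. bson.ObjectId(hex_string) when using PyMongo.
    """
    shared = "$$employee.sharedBranches"
    # If the field is not an array, wrap it in a one-element array.
    # A missing field becomes [null], which never equals a real ID.
    as_array = {
        "$cond": {
            "if": {"$ne": [{"$type": shared}, "array"]},
            "then": [shared],
            "else": shared,
        }
    }
    return [
        {"$project": {
            "employees": {
                "$filter": {
                    "input": "$employees",
                    "as": "employee",
                    "cond": {"$in": [branch_id, as_array]},
                }
            }
        }},
        # Drop documents where no employee survived the filter.
        {"$match": {"$expr": {"$gt": [{"$size": "$employees"}, 0]}}},
    ]
```

It would be used as `collection.aggregate(build_shared_branch_pipeline(ObjectId(branch_id)))`, with `ObjectId` imported from `bson`.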

2 Comments

Thanks a lot @Buzz. I like this approach, and I am not familiar enough with mongo queries to have figured that out for myself. The only issue is that it doesn't work: I am still getting the "second element must be array" error. I played around with your code and tried replacing the then and else to return empty arrays (like this: then: [], else: []). Even then, I still get the "second element must be array" error. So it's leading me to believe it's not a data thing, but a query thing. Could there be a difference between the python api and the raw mongo api?
@Alan I just converted my javascript to python and reran it and it worked -- but of course this is with my data. I have employee arrays with sharedBranches as arrays, strings, and missing entirely. There's something going on in your data set that is not being handled by the if/then/else. To discover what is going on, try:

    db.foo.aggregate([
        {"$unwind": "$employees"},
        {"$match": {"$expr": {"$and": [
            {"$ne": [{"$type": "$employees.sharedBranches"}, "array"]},
            {"$ne": [{"$type": "$employees.sharedBranches"}, "missing"]}
        ]}}}
    ])
