Note: Not a usual upsert case.

Present Document example:
{
   "_id"  : 123,
   "array": [
              {
                   "id"   : "a",
                   "col1" : "val1",
                   "col2" : "val2"
              },
              {
                   "id"   : "b",
                   "col1" : "val1",
                   "col2" : "val2"
              }
         ]
}


Update Document Sample:
{
     "id"   : "a",
     "col1" : "val1",
     "col2" : "val2",
     "col3" : "val3"
}
  1. If the document is not present, create a new document.
  2. If the array is not present, create the array and add the update document as its first element.
  3. If the array is present but no element has the given id, add the update document as a new array element.
  4. If an element with the given id is present in the array and the update document has an extra field, add that field to the element.
  5. If an element with the given id is present in the array and the update document has no additional fields (the existing element may have fields that the update document lacks), do nothing.

I cannot figure out how to do this. My update query is:

{
     "_id"      : 123,
     "array.id" : "a"
}

I think that should take care of the first three points (the update options have upsert: true).

How do I ensure that only the additional fields get appended? An upsert will create a duplicate element because the documents are not exactly the same. Since I want to index those ids, the operation would in fact fail because of the duplicate.

  • I see no data in the input to possibly match the _id value of 123, so how would you ever determine if the document already existed? Barring that, this is hardly a "new" question. Basic facts are that "upserts" and the "addition to arrays" do not mix well together. You basically need multiple update statements, with the "upsert" performed in the final statement if the document does not exist at all. There are several answers here (I know I have written a few) that you are not searching hard enough for. Bulk Updates are also the best option here. Commented Oct 8, 2015 at 10:27
  • 1. My update query contains _id: 123, which determines whether the document already exists. 2. How will Bulk updates help when I don't know which of the fields already exist there? Commented Oct 8, 2015 at 10:30
  • Your question does not show that data being present in the input, and if you hardcode that then what is the point? I suggest reading about Bulk Operations since you clearly don't know what they are. You are new here. There is an edit link on your question that you should be using to correct and clarify. Commented Oct 8, 2015 at 10:50

1 Answer

The basic case here is that "upserts" and "additions to arrays" do not mix well in a single statement. The most "basic" processing you can do is with $addToSet, but this often does not suit the case, particularly when you may want to "update" the contents of an existing array element. $addToSet compares whole elements, so an element that differs by even one field is added as a new entry rather than merged into the existing one.

So this really always breaks down to "multiple" update statements in order to get the final desired state of data. The best case handling for "multiple" updates is to use the Bulk Operations API, since all operations can be sent in a "single" request with a "single" response.

That makes things more efficient, since you do not need to wait for the server's response to each operation before sending the next one. It just takes some careful crafting of the operations so that they do not conflict with each other when adding items.

Presuming you "actually" receive an update packet like this:

{
     "_id"  : 123,
     "id"   : "a",
     "col1" : "val1",
     "col2" : "val2",
     "col3" : "val3"
}

Then your operations come out coded like this:

var bulk = db.collection.initializeOrderedBulkOp();

// Try to set fields in matched array where found
bulk.find({ "_id": 123, "array.id": "a" }).updateOne({
    "$set": { 
        "array.$.col1": "val1",
        "array.$.col2": "val2",
        "array.$.col3": "val3"
    }
});

// Try to "push" a new array element where no element with this id is found ( no upsert )
bulk.find({ "_id": 123, "array.id": { "$ne": "a" } }).updateOne({
    "$push": {
        "array": {
            "id": "a",
            "col1": "val1",
            "col2": "val2",
            "col3": "val3"
        }
    }
});

// Try to "upsert" where the basic document is not found.
// Only modify on "insert" via $setOnInsert
bulk.find({ "_id": 123 }).upsert().updateOne({
    "$setOnInsert": {
        "array": [
            {
                "id": "a",
                "col1": "val1",
                "col2": "val2",
                "col3": "val3"
            }
        ]
    }
});

// Actually send and execute the operations
bulk.execute();

This tackles the problem by implementing the logic that:

  1. Looks for an array element with the given id and updates its fields in place. These are "positional" updates. Addresses points 4 and 5.

  2. If no element with that id is present in a matching document, then "push" a new array entry. The $ne condition matches when no element has that id, which includes the case where the array field does not exist at all. Note that this is not an upsert attempt, and deliberately so. Addresses points 2 and 3.

  3. If the document is not present at all, then add a new document with a brand new array. Addresses point 1 (a new document necessarily gets a new array as well).

It's multiple updates, but it is still just "one" request and response to the database server, so this is a good thing. MongoDB will also not even try to "replace" values that already exist with the same value so there is no need to worry about that, not that it really matters.
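To make it easier to trace which statement fires in which case, here is a small plain-JavaScript model of the three chained statements. This is only a sketch for reasoning about the logic: it does not talk to MongoDB, and `applyPacket` and `store` are hypothetical names introduced purely for illustration.

```javascript
// Plain in-memory model of the three chained statements. NOT MongoDB itself.
// `store` is a hypothetical Map of _id -> document, `packet` is the flat update packet.
function applyPacket(store, packet) {
  const { _id, ...element } = packet;
  const doc = store.get(_id);
  const applied = [];

  // Statement 1: positional $set, only when an element with this id exists
  const target = doc && Array.isArray(doc.array)
    ? doc.array.find((e) => e.id === element.id)
    : undefined;
  if (target) {
    Object.assign(target, element); // existing extra fields are left alone
    applied.push("set");
  }

  // Statement 2: $push, only when the document exists but no element has this id
  // (this also covers a document that has no array field at all)
  if (doc && !target) {
    doc.array = doc.array || [];
    doc.array.push({ ...element });
    applied.push("push");
  }

  // Statement 3: upsert with $setOnInsert, which only acts when no document exists
  if (!doc) {
    store.set(_id, { _id, array: [{ ...element }] });
    applied.push("insert");
  }
  return applied;
}

const store = new Map();
applyPacket(store, { _id: 123, id: "a", col1: "val1", col2: "val2" }); // ["insert"]
applyPacket(store, { _id: 123, id: "b", col1: "val1" });               // ["push"]
applyPacket(store, { _id: 123, id: "a", col3: "val3" });               // ["set"]
```

In every case exactly one of the three branches does any work, which is the property the ordered bulk statements rely on.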

I would also suggest restructuring your input packet to be more reflective of what you want to do:

{
     "_id"  : 123,
     "array": [{
         "id"   : "a",
         "col1" : "val1",
         "col2" : "val2",
         "col3" : "val3"
     }]
}

This is far easier to use programmatically when constructing the statements required for the update, since the structure is already present to reference from.
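As a rough sketch of what that construction could look like, the three statements can be generated from the restructured packet. The names `buildOperations` and `queueOperations` are hypothetical helpers made up for this example, not part of any MongoDB API; only `find()`, `upsert()` and `updateOne()` on the bulk object are the shell's Bulk methods already used above.

```javascript
// Hypothetical helper: turns a packet shaped like { _id, array: [ { id, ... } ] }
// into plain descriptions of the three statements. Nothing here talks to a server.
function buildOperations(packet) {
  const operations = [];
  for (const element of packet.array) {
    // Positional $set for every field except the element's own id
    const positionalSet = {};
    for (const [key, value] of Object.entries(element)) {
      if (key !== "id") positionalSet["array.$." + key] = value;
    }
    // 1. Update in place where an element with this id is found
    //    (skipped when there is nothing to set, to avoid an empty $set)
    if (Object.keys(positionalSet).length > 0) {
      operations.push({
        filter: { _id: packet._id, "array.id": element.id },
        update: { $set: positionalSet },
        upsert: false
      });
    }
    // 2. Push where the document exists but no element has this id
    operations.push({
      filter: { _id: packet._id, "array.id": { $ne: element.id } },
      update: { $push: { array: element } },
      upsert: false
    });
  }
  // 3. Upsert last, touching data only on insert
  operations.push({
    filter: { _id: packet._id },
    update: { $setOnInsert: { array: packet.array } },
    upsert: true
  });
  return operations;
}

// Hypothetical helper: queues the descriptions on an ordered bulk object
function queueOperations(bulk, operations) {
  for (const op of operations) {
    let query = bulk.find(op.filter);
    if (op.upsert) query = query.upsert();
    query.updateOne(op.update);
  }
}
```

In the shell this would be used along the lines of `queueOperations(bulk, buildOperations(packet))` followed by `bulk.execute()`. Keeping the statement descriptions separate from the bulk object is just a design choice that makes them easy to inspect before anything is sent.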

If you try to "mix" the "upsert" with such operations, then the required tests inevitably result in the creation of new documents where you did not mean to do so. So it is always the case that you need multiple operations in the type of updates you want to perform.

That is why the "upsert" is performed "last", and always with the $setOnInsert modifier, so that any "matching" document is not modified at all unless the result is actually an "upsert" that creates a brand new document. Again, it is not wise to mix other update modifiers into this operation in a single statement.

This is one good reason why the "Bulk API" exists, so that "chained" logic like this can be put into a single server request, instead of multiple requests.
