2

I have a collection of students that have a name and an array of email addresses. A student document looks something like this:

{
  "_id": {"$oid": "56d06bb6d9f75035956fa7ba"},
  "name": "John Doe",
  "emails": [
    {
      "label": "private",
      "value": "[email protected]"
    },
    {
      "label": "work",
      "value": "[email protected]"
    }
  ]
}

The label in the email subdocument is set to be unique per document, so there can't be two entries with the same label.

My problem is that when updating a student document, I want to achieve the following:

  • adding an email with a new label should simply add a new subdocument with the given label and value to the array
  • adding an email with a label that already exists should set the value of the existing entry to the value given in the update

For example when updating with the following data:

{
  "_id": {"$oid": "56d06bb6d9f75035956fa7ba"},
  "emails": [
    {
      "label": "private",
      "value": "[email protected]"
    },
    {
      "label": "school",
      "value": "[email protected]"
    }
  ]
}

I would like the result of the emails array to be:

"emails": [
    {
      "label": "private",
      "value": "[email protected]"
    },
    {
      "label": "work",
      "value": "[email protected]"
    },
    {
      "label": "school",
      "value": "[email protected]"
    }
  ]

How can I achieve this in MongoDB (optionally using mongoose)? Is this at all possible or do I have to check the array myself in the application code?

3
  • The problem with $addToSet is that it ignores duplicates and doesn't replace the existing value when posting an email with a label that already exists. However, I want the subdocument to be replaced/updated if the label already exists. Commented Aug 24, 2016 at 14:31
  • Oops, jumped the gun here without reading the question fully. Yea, you are right it doesn't work in this case. Commented Aug 24, 2016 at 14:33
  • @chridam No problem. Any other ideas? Commented Aug 24, 2016 at 15:01

3 Answers

1

You could try this update, but it is only efficient for small datasets, since it issues separate update calls for each email:

mongo shell:

var data = {
    "_id": ObjectId("56d06bb6d9f75035956fa7ba"),
    "emails": [
        {
          "label": "private",
          "value": "[email protected]"
        },
        {
          "label": "school",
          "value": "[email protected]"
        }
    ]
};

data.emails.forEach(function(email) {
    // First try to overwrite the value of an existing entry with this label.
    var result = db.students.update(
        { "_id": data._id, "emails.label": email.label },
        { "$set": { "emails.$.value": email.value } }
    );

    // nMatched is reported by the legacy mongo shell's WriteResult.
    // If nothing matched, no entry has this label yet, so append one.
    if (result.nMatched === 0) {
        db.students.update(
            { "_id": data._id, "emails.label": { "$ne": email.label } },
            { "$push": { "emails": email } }
        );
    }
});

1 Comment

Interesting approach. I'm thinking to just do the matching in the code though using lodash's unionBy() lodash.com/docs#unionBy - and then simply $set the new array. I think it will have better performance.
1

Suggestion: refactor your data to use the "label" as an actual field name.

There is one straightforward way in which MongoDB can guarantee unique values for a given email label - by making the label a single separate field in itself, in an email sub-document. Your data needs to exist in this structure:

{
  "_id": ObjectId("56d06bb6d9f75035956fa7ba"),
  "name": "John Doe",
  "emails": {
      "private": "[email protected]",
      "work" : "[email protected]"
  }
}

Now, when you want to update a student's emails you can do an update like this:

db.students.update(
    {"_id": ObjectId("56d06bb6d9f75035956fa7ba")},
    {$set: {
        "emails.private" : "[email protected]",
        "emails.school" : "[email protected]"
    }}
);

And that will change the data to this:

{
  "_id": ObjectId("56d06bb6d9f75035956fa7ba"),
  "name": "John Doe",
  "emails": {
      "private": "[email protected]",
      "work" : "[email protected]",
      "school" : "[email protected]"
  }
}

Admittedly there is a disadvantage to this approach: you will need to change the structure of the input data, from the emails being in an array of sub-documents to the emails being a single sub-document of single fields. But the advantage is that your data requirements are automatically met by the way that JSON objects work.
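
If the input still arrives as an array of label/value pairs, the conversion into a $set document could look roughly like the sketch below. The helper name toEmailSetDocument, the rejection of labels containing "." or starting with "$", and the example.com addresses are all assumptions made for illustration, not part of the original answer.

```javascript
// Hypothetical helper: turns [{label, value}, ...] into a flat $set document
// such as {"emails.private": "...", "emails.school": "..."}.
function toEmailSetDocument(emails) {
    var set = {};
    emails.forEach(function (email) {
        // Assumption: labels containing "." or starting with "$" are rejected,
        // because they are not safe to use inside a dotted update path.
        if (email.label.indexOf(".") !== -1 || email.label.charAt(0) === "$") {
            throw new Error("Unsupported label: " + email.label);
        }
        set["emails." + email.label] = email.value;
    });
    return set;
}

// Made-up input in the shape used by the question.
var incoming = [
    { label: "private", value: "new-private@example.com" },
    { label: "school", value: "school@example.com" }
];

var setDocument = toEmailSetDocument(incoming);
console.log(setDocument);
// The result could then be used in an update like the one above, e.g.
// db.students.update({ "_id": id }, { $set: setDocument });
```

Because every label becomes its own key, an existing label is overwritten and a new label is added by the same $set, without any lookup beforehand.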

2 Comments

I thought about this too. The problem I see is that the user would not be able to define any label he wants. I would have to define the names of the fields in advance, at least as I am using mongoose. The only way I can think of is to define the emails property as schema type Mixed (mongoosejs.com/docs/schematypes.html), but this would have the disadvantage that I would not be able to run any validation on the data.
Just for future reference: Just found out that another way to achieve this is by defining a custom schema for emails and then set the strict:false option on that schema. This way you can define certain properties, but can still add new ones. However for the undefined properties the problem remains that there would be no validation. See: github.com/Automattic/mongoose/issues/2010
0

After investigating the different options posted, I decided to go with my own approach of doing the update manually in the code using lodash's unionBy() function. Using express and mongoose's findById() that basically looks like this:

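Student.findById(req.params.id, function(err, student) {
    // Forward lookup errors instead of dereferencing an undefined student.
    if(err) return next(err);

    if(req.body.name) student.name = req.body.name;
    if(req.body.emails && req.body.emails.length > 0) {
        // Entries from the request win over stored entries with the same label.
        student.emails = _.unionBy(req.body.emails, student.emails, 'label');
    }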

    student.save(function(err, result) {
        if(err) return next(err);
        res.status(200).json(result);
    });

});

This way I get the full flexibility of partial updates for all fields. Of course you could also use findByIdAndUpdate() or other options.
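
One detail worth knowing: _.unionBy(req.body.emails, student.emails, 'label') puts the entries from the request first and then the stored entries whose labels were not in the request, so the array order can change. If the original order matters, a dependency-free merge along the following lines might be used instead. This is only a sketch: mergeEmailsByLabel is a hypothetical helper, the addresses are made up, and it copies only the label and value properties of each entry.

```javascript
// Hypothetical helper: merges incoming emails into existing ones by label.
// Existing entries keep their position; new labels are appended at the end.
function mergeEmailsByLabel(existing, incoming) {
    // Copy the entries so the input arrays are not mutated.
    var merged = existing.map(function (e) {
        return { label: e.label, value: e.value };
    });
    incoming.forEach(function (email) {
        var match = merged.find(function (e) {
            return e.label === email.label;
        });
        if (match) {
            // Label already present: overwrite the value in place.
            match.value = email.value;
        } else {
            // New label: append a new entry.
            merged.push({ label: email.label, value: email.value });
        }
    });
    return merged;
}

// Made-up data in the shape used by the question.
var existing = [
    { label: "private", value: "old-private@example.com" },
    { label: "work", value: "work@example.com" }
];
var incoming = [
    { label: "private", value: "new-private@example.com" },
    { label: "school", value: "school@example.com" }
];

var result = mergeEmailsByLabel(existing, incoming);
console.log(result.map(function (e) { return e.label; }));
// labels in order: private, work, school
```

The merged array could then be assigned to student.emails before calling save(), in the same place where the unionBy() result is assigned above.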

Alternate approach:

However, changing the schema as Vince Bowdren suggested, so that each label becomes its own field in an emails subdocument, is also a viable option. In the end it depends on your personal preferences and on whether you need strict validation of your data.

If you are using mongoose like I do, you would have to define a separate schema like so:

var EmailSchema = new mongoose.Schema({
  work: { type: String, validate: validateEmail },
  private: { type: String, validate: validateEmail }
}, {
  strict: false,
  _id: false
});

In the schema you can define properties for the labels you already want to support and add validation. By setting the strict: false option, you would allow the user to also post emails with custom labels. Note however, that these would not be validated. You would have to apply the validation manually in your application similar to the way I did it in my approach above for the merging.
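
The manual validation for custom labels mentioned above could be sketched as follows. The validateEmail shown here is only a hypothetical stand-in for the function referenced in the schema (a deliberately loose pattern check), and findInvalidEmailLabels is likewise an assumed helper name, not a mongoose API.

```javascript
// Hypothetical stand-in for the validateEmail function used in the schema.
// Deliberately loose: something@something.something without whitespace.
function validateEmail(value) {
    return typeof value === "string" &&
        /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(value);
}

// Returns the labels whose values fail validation.
// An empty array means every posted email passed the check.
function findInvalidEmailLabels(emails) {
    return Object.keys(emails).filter(function (label) {
        return !validateEmail(emails[label]);
    });
}

// Made-up request body: "hobby" is a custom label not defined in the schema.
var posted = {
    work: "work@example.com",
    hobby: "not-an-email"
};

console.log(findInvalidEmailLabels(posted));
// labels failing the check: hobby
```

Running a check like this before saving covers the custom labels that the schema-level validators never see.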
