My MongoDB collection holds a list of accounts, each with an array of roles. The documents have the following structure:

{
    "_id" : "acc1",
    "email" : "[email protected]",
    "password" : "password",
    "roles" : [
        "ADMIN",
        "USER"
    ]
},
{
    "_id" : "acc2",
    "email" : "[email protected]",
    "password" : "password",
    "roles" : [
        "USER"
    ]
},
{
    "_id" : "acc3",
    "email" : "[email protected]",
    "password" : "password",
    "roles" : [
        "ADMIN",
        "SYSTEM",
        "USER"
    ]
}

Now I would like to add the prefix ROLE_ to every role, so that the last document, for example, becomes:

{
    "_id" : "acc3",
    "email" : "[email protected]",
    "password" : "password",
    "roles" : [
        "ROLE_ADMIN",
        "ROLE_SYSTEM",
        "ROLE_USER"
    ]
}

I don't know how to write a MongoDB script that applies this transformation to all documents, prepending the prefix to each array element.

2 Answers

You can use cursor.forEach() to iterate over the collection and update each document. It's very simple, but slow, and shouldn't be used on large collections.

db.users.find().forEach(function (doc) {
    var newRoles = doc.roles.map(function (value) {
        return "ROLE_" + value;
    });
    db.users.update(
        {_id: doc._id}, 
        {$set: {roles: newRoles}}
    );
});

I measured the execution time with MongoDB 3.2 on a collection of 50k documents. Here are the results:

  • this approach: 17.244s
  • user3100115's approach: 2.181s

The obvious conclusion: use this simple approach only on small collections, and stick with the bulk approach for large ones.
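One caveat with the loop above: running it a second time would turn ROLE_ADMIN into ROLE_ROLE_ADMIN. A minimal sketch of a guard against that, assuming roles are always strings; the helper name prefixRoles is hypothetical and not part of the original answer:

```javascript
// Hypothetical helper: add the prefix only to roles that do not already have it,
// so the migration can be re-run safely.
function prefixRoles(roles, prefix) {
    return roles.map(function (role) {
        // indexOf(...) === 0 means the role already starts with the prefix
        return role.indexOf(prefix) === 0 ? role : prefix + role;
    });
}

// Assumed usage in the mongo shell, replacing the inline map() call:
// db.users.find().forEach(function (doc) {
//     db.users.update(
//         {_id: doc._id},
//         {$set: {roles: prefixRoles(doc.roles, "ROLE_")}}
//     );
// });
```

This does not make the loop any faster; it only makes a repeated run harmless.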

1 Comment

@KhoiNguyen This is bad because if you have 50K documents you will hit the database 50K times. Very inefficient.

The best way to do this is to use the .aggregate() method, which provides access to the aggregation pipeline.

The pipeline needs only one stage, $project, in which the $map operator applies an expression to each element of the roles array and returns the resulting array. Here that expression is $concat, which concatenates strings and returns the result.

You then iterate over the aggregation result, which is a cursor, and update your documents using "bulk" operations for maximum efficiency.

var bulkOp = db.users.initializeOrderedBulkOp();
var count  = 0;

db.users.aggregate([
    { "$project": { 
        "roles": { 
            "$map": { 
                "input": "$roles", 
                "as": "role", 
                "in": { "$concat": [ "ROLE_", "$$role" ] } 
            } 
        } 
    }}
]).forEach(function(doc) {
    bulkOp.find( { "_id": doc._id } ).updateOne(
        { "$set": { "roles": doc.roles } }
    );
    count++;
    if (count % 300 === 0) {
        // Execute per 300 operations and re-init
        bulkOp.execute();
        bulkOp = db.users.initializeOrderedBulkOp();
    }
})

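// Execute any operations still queued after the last full batch.
// Calling execute() on an empty bulk throws, hence the modulo check.

if (count % 300 !== 0)
    bulkOp.execute();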

MongoDB 3.2 deprecates Bulk() and its associated methods and provides the .bulkWrite() method.

var requests = [];

db.users.aggregate([
    { "$project": { 
        "roles": { 
            "$map": { 
                "input": "$roles", 
                "as": "role", 
                "in": { "$concat": [ "ROLE_", "$$role" ] } 
            } 
        } 
    }}
]).forEach( doc => {
    requests.push( 
        { "updateOne": 
            { 
                "filter": { "_id": doc._id },
                "update": { "$set": { "roles": doc.roles } }
            }
        }
    );
    if (requests.length === 1000) {
        // Execute per 1000 operations
        db.users.bulkWrite(requests);
        requests = [];
    }
});

// Flush the remaining operations; bulkWrite() rejects an empty array
if (requests.length > 0) {
    db.users.bulkWrite(requests);
}

Your documents then look like this:

{
        "_id" : "acc1",
        "email" : "[email protected]",
        "password" : "password",
        "roles" : [
                "ROLE_ADMIN",
                "ROLE_USER"
        ]
}
{
        "_id" : "acc2",
        "email" : "[email protected]",
        "password" : "password",
        "roles" : [
                "ROLE_USER"
        ]
}
{
        "_id" : "acc3",
        "email" : "[email protected]",
        "password" : "password",
        "roles" : [
                "ROLE_ADMIN",
                "ROLE_SYSTEM",
                "ROLE_USER"
        ]
}
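As a side note, on MongoDB 4.2 or newer (an assumption, since the code above targets 3.2) the update commands accept an aggregation pipeline, so the same $map / $concat expression can run entirely on the server without iterating a cursor. A minimal sketch; the helper name buildPrefixPipeline is hypothetical:

```javascript
// Hypothetical helper: build an update pipeline that prefixes every element
// of the `roles` array, using the same $map / $concat expression as above.
function buildPrefixPipeline(prefix) {
    return [
        { "$set": {
            "roles": {
                "$map": {
                    "input": "$roles",
                    "as": "role",
                    "in": { "$concat": [ prefix, "$$role" ] }
                }
            }
        }}
    ];
}

// Assumed usage in a MongoDB 4.2+ shell. The filter skips documents whose
// `roles` field is missing or not an array, where $map would yield null:
// db.users.updateMany(
//     { "roles": { "$type": "array" } },
//     buildPrefixPipeline("ROLE_")
// );
```

This is a different technique from the bulk approach above, not a replacement for it on older servers.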
