22

I was wondering what the correct way is to do bulk inserts into MongoDB (although it could be any other database) with Node.js.

I have written the following code as an example, although I believe it is flawed, as db.close() may be run before all the asynchronous collection.insert calls have completed.

MongoClient.connect('mongodb://127.0.0.1:27017/test', function (err, db) {
    var i, collection;
    if (err) {
        throw err;
    }
    collection = db.collection('entries');
    for (i = 0; i < entries.length; i++) {
        collection.insert(entries[i].entry);
    }
    db.close();
});

4 Answers

26

If your MongoDB server is 2.6 or newer, it is better to use the Bulk API, which is built on the server's write commands. Its bulk insert operations are simply abstractions on top of the server that make bulk operations easy to build, giving you performance gains when inserting into large collections.

Sending the insert operations in batches results in less traffic to the server: instead of sending every document in its own statement, the work is broken into manageable chunks for the server to commit. There is also less time spent waiting for responses in callbacks with this approach.
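
As a rough illustration of the batching idea, here is a minimal sketch of splitting an array into fixed-size chunks. The helper name chunk is hypothetical (it is not part of the MongoDB driver), and the 2,500 empty objects are made-up placeholder data.

```javascript
// Hypothetical helper: split `items` into arrays of at most `size` elements.
function chunk(items, size) {
    const batches = [];
    for (let i = 0; i < items.length; i += size) {
        batches.push(items.slice(i, i + size));
    }
    return batches;
}

// 2,500 placeholder documents become three batches instead of 2,500 single inserts.
const batches = chunk(new Array(2500).fill({}), 1000);
console.log(batches.map(function (b) { return b.length; })); // [ 1000, 1000, 500 ]
```

Each batch would then be sent to the server as one bulk operation.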

These bulk operations come mainly in two flavours:

  • Ordered bulk operations. These execute all the operations in order and error out on the first write error.
  • Unordered bulk operations. These execute all the operations in parallel and aggregate all the errors. Unordered bulk operations do not guarantee order of execution.

Note that for servers older than 2.6 the API will downconvert the operations. However, it's not possible to downconvert 100%, so there might be some edge cases where it cannot correctly report the right numbers.
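
To make the two flavours concrete, the sketch below requests an unordered bulk insert through bulkWrite by passing ordered: false in the options. The function name insertUnordered is hypothetical, and collection is assumed to be a driver collection (or any object exposing a compatible bulkWrite).

```javascript
// Sketch: build one insertOne operation per document and ask for unordered execution.
// `insertUnordered` is an illustrative name, not a driver API.
function insertUnordered(collection, docs) {
    const ops = docs.map(function (doc) {
        return { insertOne: { document: doc } };
    });
    // ordered: false lets the remaining writes continue after an individual write error
    return collection.bulkWrite(ops, { ordered: false });
}
```

Leaving the option out (or passing ordered: true) gives the ordered behaviour, which stops at the first write error.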

In your case, you could implement the Bulk API insert operation in batches of 1000 like this:

For MongoDB 3.2+ using bulkWrite

var MongoClient = require('mongodb').MongoClient;
var url = 'mongodb://localhost:27017/test';
var entries = [ ... ] // a huge array containing the entry objects

var createNewEntries = function(db, entries, callback) {

    // Get the collection and bulk api artefacts
    var collection = db.collection('entries'),
        bulkUpdateOps = [],
        pending = []; // promises for the batches already sent

    entries.forEach(function(doc) {
        bulkUpdateOps.push({ "insertOne": { "document": doc } });

        if (bulkUpdateOps.length === 1000) {
            pending.push(collection.bulkWrite(bulkUpdateOps));
            bulkUpdateOps = [];
        }
    });

    // Send whatever is left over after the last full batch
    if (bulkUpdateOps.length > 0) {
        pending.push(collection.bulkWrite(bulkUpdateOps));
    }

    // Invoke the callback only once every batch has completed (or one has failed)
    Promise.all(pending).then(function(results) {
        callback(null, results);
    }, callback);
};

For MongoDB <3.2

var MongoClient = require('mongodb').MongoClient;
var url = 'mongodb://localhost:27017/test';
var entries = [ ... ] // a huge array containing the entry objects

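var createNewEntries = function(db, entries, callback) {

    // Get the collection and bulk api artefacts
    var collection = db.collection('entries'),
        bulk = collection.initializeOrderedBulkOp(), // Initialize the Ordered Batch
        counter = 0,
        pending = 0,     // batches sent but not yet acknowledged
        queued = false,  // true once every batch has been sent
        firstErr = null;

    // Runs when a batch finishes; fires the callback once, after the last batch
    function done(err) {
        firstErr = firstErr || err;
        pending--;
        if (queued && pending === 0) callback(firstErr);
    }

    // Execute the current batch and start a fresh one immediately,
    // so that later inserts never land in a batch that was already executed
    function flush() {
        pending++;
        bulk.execute(done);
        bulk = collection.initializeOrderedBulkOp();
    }

    // Execute the forEach method, triggers for each entry in the array
    entries.forEach(function(obj) {
        bulk.insert(obj);
        counter++;
        if (counter % 1000 == 0) flush();
    });

    if (counter % 1000 != 0) flush();
    queued = true;
    if (pending === 0) callback(firstErr); // nothing left in flight
};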

Call the createNewEntries() function.

MongoClient.connect(url, function(err, db) {
    if (err) throw err;
    createNewEntries(db, entries, function(err) {
        if (err) console.error(err);
        db.close();
    });
});
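
If your Node.js version supports async/await, the same idea can be sketched with the batches sent one after another, so that nothing is still in flight when the function returns. The name insertInBatches and the batchSize parameter are hypothetical, and collection is assumed to expose the driver's promise-returning bulkWrite.

```javascript
// Sketch: send the batches sequentially and resolve with the number of inserted documents.
// `insertInBatches` is an illustrative name, not a driver API.
async function insertInBatches(collection, entries, batchSize) {
    let inserted = 0;
    for (let start = 0; start < entries.length; start += batchSize) {
        const ops = entries.slice(start, start + batchSize).map(function (doc) {
            return { insertOne: { document: doc } };
        });
        // Wait for this batch to be acknowledged before sending the next one
        const result = await collection.bulkWrite(ops);
        inserted += result.insertedCount;
    }
    return inserted;
}
```

A caller would close the connection only after the returned promise settles, for example in a then/catch pair attached to insertInBatches(db.collection('entries'), entries, 1000).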

6 Comments

How would you close the db if counter % 1000 == 0?
You may have to add the db.close(); statement after the if (counter % 1000 != 0 ){ ... } statement block to close the db after all.
Do you not then risk calling db.close() whilst the bulk.execute calls from the forEach are still running?
You won't, because here bulk.execute() is a MongoDB write operation and it's an asynchronous I/O call. This allows Node.js to proceed with the event loop before bulk.execute() is done with its db writes and calls back. I've updated my answer with this callback approach.
This is so much more convenient and it works beautifully! Thanks for this answer.
11

You can use insertMany. It accepts an array of objects. Check the API.
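
A minimal sketch of that approach follows. The wrapper name insertAll is hypothetical, and collection is assumed to be a driver collection whose insertMany returns a promise when no callback is passed. The empty-array guard reflects my understanding that the driver rejects an empty batch; treat that as an assumption.

```javascript
// Sketch: insert every document with one insertMany call and report how many were written.
// `insertAll` is an illustrative name, not a driver API.
function insertAll(collection, docs) {
    if (docs.length === 0) {
        // Assumption: the driver rejects an empty batch, so skip the call entirely
        return Promise.resolve(0);
    }
    return collection.insertMany(docs).then(function (result) {
        return result.insertedCount;
    });
}
```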

3 Comments

That is ok if you have a small number of records to insert, but what if you had many thousands?
In a bulk operation, MongoDB (3.x) groups the documents into batches of 1000; for more documents it creates additional groups and executes them. Please refer to docs.mongodb.com/v3.2/reference/method/db.collection.insertMany/…
Another place in the documentation where the earlier comment is re-iterated regarding insertMany: PyMongo will automatically split the batch into smaller sub-batches based on the maximum message size accepted by MongoDB, supporting very large bulk insert operations.
2

New in version 3.2.

The db.collection.bulkWrite() method provides the ability to perform bulk insert, update, and remove operations. MongoDB also supports bulk insert through db.collection.insertMany().

bulkWrite supports only insertOne, updateOne, updateMany, replaceOne, deleteOne, and deleteMany.
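
For illustration, a single bulkWrite call can mix several of these operation types. The documents and filters below are made-up sample data, and applyOps is a hypothetical wrapper around the driver's bulkWrite.

```javascript
// Made-up sample operations showing three of the supported types in one array.
const sampleOps = [
    { insertOne: { document: { name: 'Data1', work: 'student' } } },
    { updateOne: { filter: { name: 'Data2' }, update: { $set: { work: 'teacher' } } } },
    { deleteOne: { filter: { name: 'Data3' } } }
];

// Hypothetical wrapper: `collection` is assumed to expose the driver's bulkWrite.
function applyOps(collection, ops) {
    return collection.bulkWrite(ops);
}
```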

In your case, to insert the data with a single line of code, you can use insertMany.

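MongoClient.connect('mongodb://127.0.0.1:27017/test', function (err, db) {
    var collection;
    if (err) {
        throw err;
    }
    collection = db.collection('entries');
    // Close the connection only after the insert has completed
    collection.insertMany(entries, function (err, result) {
        db.close();
        if (err) {
            throw err;
        }
    });
});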


1
var MongoClient = require('mongodb').MongoClient;
var url = 'mongodb://localhost:27017/test';
var data1={
    name:'Data1',
    work:'student',
    No:4355453,
    Date_of_birth:new Date(1996,10,17)
};

var data2={
    name:'Data2',
    work:'student',
    No:4355453,
    Date_of_birth:new Date(1996,10,17)
};

MongoClient.connect(url, function(err, db) {
    if(err!=null){
        return console.log(err.message)
    }

    //insertOne
    db.collection("App").insertOne(data1,function (err,data) {

        if(err!=null){
            db.close();
            return console.log(err);
        }
        console.log(data.ops[0]);

        //insertMany, started only after insertOne has finished
        var Data=[data1,data2];

        // forceServerObjectId has to be passed inside an options object
        db.collection("App").insertMany(Data,{forceServerObjectId:true},function (err,data) {
            // close the connection once the last write has completed
            db.close();
            if(err!=null){
                return console.log(err);
            }
            console.log(data.ops);
        });
    });
});

3 Comments

While this code may answer the question, providing additional context regarding why and/or how this code answers the question improves its long-term value.
This code is only an example of inserting many records or a single record.
"forceServerObjectId=true" worked for Duplicate Key Error id, Thanks
