
I am working on a project comparing the processing times of queries with and without caching on different DB systems. Right now I am using Node.js and MongoDB.

I have a txt file with queries (or more like conditions) that looks like this:

{'c1' : { $regex: /^A/ }, 'c2' : 'something', 'c3' : '8'}
{'c1' : { $regex: /^B/ }, 'c2' : 'somethingElse', 'c3' : '12'}
{'c1' : { $regex: /^C/ }, 'c2' : 'somethingDifferent', 'c3' : '16'}
...

And I need to read all these strings/objects from the file, make a query from each one of them and run them on the database (and measure the time it takes to finish all of these queries).

So my idea is to read the file line by line using a lineReader and convert the line to a query immediately, e.g.:

var lineReader = require('readline').createInterface({
    input: require('fs').createReadStream('file.txt')
});

lineReader.on('line', function (line) {
    query = line;
    //this console.log is the only output I ever get
    console.log(query);

    query = JSON.stringify(query);
    dbo.collection('myCol').find(JSON.parse(query)).toArray(function(err, result) {
        //This code is never reached
        if (err) throw err;
        console.log(result);
    });
});

db.close();

But this approach is wrong, because I never get any result from find(query).toArray() and the program crashes with

MongoError: pool destroyed

every time.

I tried several different solutions, but I always ended up with this error or with MongoNetworkError: connection destroyed, not possible to instantiate cursor, or even the process running out of memory.

EDIT: as suggested in the comments, I now parse each line into an object (JSON parsing) before passing it to find(), but the problem remains.
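For reference, the change was roughly along these lines (a minimal sketch, not my exact code; it assumes each line in file.txt is strict JSON with the regex prefix stored as a plain string, which the sample lines above are not):

lineReader.on('line', function (line) {
    // Assumption: the line is strict JSON, e.g. {"c1": "^A", "c2": "something", "c3": "8"}.
    // The sample lines above use single quotes and regex literals, which JSON.parse rejects.
    var query = JSON.parse(line);
    // Rebuild the regex condition from the stored prefix string.
    query.c1 = { $regex: new RegExp(query.c1) };

    dbo.collection('myCol').find(query).toArray(function (err, result) {
        if (err) throw err;
        console.log(result);
    });
});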

2 Comments

  • You're using a line reader and reading the text file line by line, so for each line you get a string representation of an object, but MongoDB's find function takes an object, not a string containing an object.
  • Thanks for your advice. I changed my code to parse JSON from the string; however, the problem remains. No results are printed and I still get the MongoError: pool destroyed error.

2 Answers


The MongoNetworkError: connection destroyed error is only thrown by MongoDB when the connection is closed prematurely. MongoDB queries are asynchronous, so with a large file the connection gets closed by db.close() before the queries finish. Try removing db.close() (for testing and development purposes) and find a way to close the connection safely after the queries have run, before a production release.
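One way to close it safely (a sketch, not tested against your setup; it reuses the lineReader, dbo and db objects from the question and assumes each line parses into a query object) is to count the outstanding queries and close only once the reader has finished and every callback has returned:

var pending = 0;
var doneReading = false;

function maybeClose() {
    // Close only after the file is fully read and every query has completed.
    if (doneReading && pending === 0) {
        db.close();
    }
}

lineReader.on('line', function (line) {
    pending++;
    dbo.collection('myCol').find(JSON.parse(line)).toArray(function (err, result) {
        pending--;
        if (err) throw err;
        console.log(result);
        maybeClose();
    });
});

lineReader.on('close', function () {
    doneReading = true;
    maybeClose();
});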

EDIT

Run your server with this flag --max-old-space-size=4096

node --max-old-space-size=4096 yourFile.js

Your server is using more memory than the default heap limit defined by the V8 engine, which is around 1.7 GB.


3 Comments

When I remove db.close(), my program freezes and still doesn't print any results for any of my queries. Then, after a few seconds, I get FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory. Could it be a sign of something being wrong with how my queries are executed? (I can post the whole stack trace if needed.)
Check out the edit. Also, can you tell me the size of your file?
Running my server with this flag caused the process to use more and more memory, up to about 5 GB, also maxing out my PC's resources, so I killed it. It didn't crash by itself, though, like it did before. My file (the one with the queries) is 64 kB - 500 rows in total. I set it up just for testing; I would like to use a file 10 times that size in the future.

When you run lineReader like that, you have no backpressure, meaning it will read all lines and fire everything to mongo.

Three options that come to mind are:

  1. You want backpressure, to control the rate at which operations are passed to mongo. For example, using async iterators:
const fs = require('fs');
const readline = require('readline');
const stream = require('stream');

function readLines({ input }) {
  const output = new stream.PassThrough({ objectMode: true });
  const rl = readline.createInterface({ input });
  rl.on("line", line => {
    // write() returns false when the internal buffer is full;
    // pause the reader until the consumer drains it (backpressure).
    if (!output.write(line)) {
      rl.pause();
    }
  });
  output.on("drain", () => rl.resume());
  rl.on("close", () => {
    output.end(); // signal end-of-stream to the async iterator
  });
  return output;
}

const input = fs.createReadStream("./pathToFile");

(async () => {
  for await (const line of readLines({ input })) {
    // mongo interaction here; await each operation
  }
})();
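The loop body could look roughly like this (a sketch that reuses readLines and input from the block above, plus dbo and db from the question, and assumes each line is strict JSON); because each operation is awaited, it is also a convenient place to measure the total time and close the connection:

(async () => {
  const start = Date.now();
  for await (const line of readLines({ input })) {
    // Assumption: the line is valid JSON describing the query conditions.
    const query = JSON.parse(line);
    const result = await dbo.collection('myCol').find(query).toArray();
    console.log(result.length);
  }
  console.log(`all queries finished in ${Date.now() - start} ms`);
  db.close(); // safe here: every query has already been awaited
})();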
  2. Accumulate the conditions and send one batched query to mongo, for example with the $in operator, which accepts regular expressions (see the sketch after this snippet), like
{ 'c1': { $in: [/^A/, /^B/, /^C/] } }
...
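A minimal sketch of that batching idea, reusing the lineReader, dbo and db objects from the question; it only covers the c1 condition and assumes the regex prefix can be recovered as a string from each parsed line, so treat the details as assumptions:

var prefixes = [];

lineReader.on('line', function (line) {
    // Assumption: the line is strict JSON with the prefix stored as a plain string.
    var condition = JSON.parse(line);
    prefixes.push(new RegExp(condition.c1));
});

lineReader.on('close', function () {
    // One round trip to mongo instead of one query per line.
    dbo.collection('myCol').find({ 'c1': { $in: prefixes } }).toArray(function (err, result) {
        if (err) throw err;
        console.log(result.length);
        db.close();
    });
});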
  3. Accumulate the queries and, once all the lines have been read, use Bluebird's Promise.map to control concurrency. For example:
var lineReader = require('readline').createInterface({
    input: require('fs').createReadStream('file.txt')
});
var bluebird = require('bluebird');

var queries = [];

lineReader.on('line', function (line) {
    queries.push(JSON.parse(line)); // collect the parsed query objects
});

lineReader.on('close', function () {
    // Run at most 50 queries against mongo at a time.
    bluebird
      .map(queries,
        query => dbo.collection('myCol').find(query).toArray(),
        { concurrency: 50 })
      .then(results => { /* handle the results here */ })
      .catch(err => console.error(err))
      .finally(() => db.close()); // close only after every query has settled
});

