I am loading multiple large JSON files into my MongoDB database. At the moment I am using the stream-json npm package. After one file has loaded, I change the filename in the script and relaunch it to load the next file, which is unnecessarily time-consuming. How can I iterate through all the files automatically? At the moment my code looks like this:

const StreamArray = require('stream-json/utils/StreamArray');
const path = require('path');
const fs = require('fs');

const filename = path.join(__dirname, './data/xa0.json'); //The next file is named xa1.json and so on.

const stream = StreamArray.make();

stream.output.on('data', function (object) {
    // my function block
});

stream.output.on('end', function () {
    console.log('File Complete');
});

fs.createReadStream(filename).pipe(stream.input);

I tried iterating through the file names by adding a loop that increments the filename (e.g. xa0 to xa1) at the point where the script logs 'File Complete', but this didn't work. Any ideas on how I might achieve this or something similar?

1 Answer

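Just scan your JSON files directory using fs.readdir. It gives you an array of file names (names only, not full paths) that you can then iterate over, something like this:

fs.readdir("./jsonfiles", async (err, files) => {
    if (err) throw err;
    // for...of yields the file names; for...in would yield the array indices
    for (const file of files) {
      await saveToMongo("./jsonfiles/" + file)
    }
})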

This way you launch the script once and wait for it to finish.
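
One thing worth guarding against: I would not rely on the order in which the directory listing comes back, and a plain alphabetical sort would put xa10.json before xa2.json. Here is a minimal sketch of a helper that keeps only .json files and sorts them numerically. The name listJsonFiles is hypothetical (it is not part of fs or stream-json), and it assumes the files follow the xa0.json, xa1.json, ... naming from the question.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: returns the full paths of the .json files in `dir`,
// ordered so that xa2.json comes before xa10.json.
function listJsonFiles(dir) {
  return fs.readdirSync(dir)
    // Skip anything in the directory that is not a .json file.
    .filter(name => path.extname(name).toLowerCase() === '.json')
    // numeric: true compares digit runs as numbers rather than character by character.
    .sort((a, b) => a.localeCompare(b, undefined, { numeric: true }))
    // readdir returns bare names, so prepend the directory.
    .map(name => path.join(dir, name));
}
```

The resulting array could then be looped over with for...of, awaiting saveToMongo for each path.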

Of course, for it to be awaitable, saveToMongo needs to return a Promise, something like:

const saveToMongo = fileName => {

    return new Promise( (resolve, reject) => {

        // ... logic here

        stream.output.on('end', function () {
            console.log('File Complete');
            resolve() // Will trigger the next await
        });
    })
}
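
To make the shape of this concrete, here is a rough sketch using only Node's built-in fs module, with the JSON parsing left out. The key point is that the stream is created inside the Promise executor, so each call starts its own read, and the Promise settles on 'end' or 'error'. The names processFile and processAll are hypothetical. In the question's setup, the read stream would be piped into stream.input and the handlers attached to stream.output instead of to the raw file stream.

```javascript
const fs = require('fs');

// Wrap one file's stream in a Promise. The stream is created INSIDE the
// executor, so calling this function is what starts the read.
function processFile(fileName, onChunk) {
  return new Promise((resolve, reject) => {
    const source = fs.createReadStream(fileName);
    // Assumption: with stream-json, `source` would be piped into the parser
    // here and these handlers attached to the parser's output instead.
    source.on('data', onChunk);
    source.on('error', reject); // e.g. the file does not exist
    source.on('end', resolve);  // lets the awaiting loop move on
  });
}

// Process the files strictly one after another.
async function processAll(fileNames, onChunk) {
  for (const fileName of fileNames) {
    await processFile(fileName, onChunk);
    console.log('File Complete: ' + fileName);
  }
}
```

If a fresh stream is not created per call (for example, if a single stream defined at the top of the script is reused), the loop has nothing new to wait on, which may be one reason the files appear to be "iterated through in a second".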

3 Comments

Thanks for your help. Whilst it iterates through the file names without any issues, the stream within the saveToMongo block isn't launching/executing, so all the file names are simply iterated through in a second. Within the saveToMongo function block I have the same code as from const stream = downwards in the above example. I also know the filename passed to fs.createReadStream is available within the block and is the same as before I added the iterator. Any ideas why the stream is not executing? Thanks so much for your help.
Yes, I have amended my answer and added stuff about Promises
Thanks @Jeremy Thille !
