
I am developing code for a bulk import of items into a database. The Excel file contains 24,103 records in total. Initially, importing just 700 records took 40 minutes because of the heavy validation. After streamlining the steps, the time has dropped to 24 minutes for all records. There are 4 activities that could run in parallel, independently of each other. I tried running them in parallel with async.parallel, but when I checked the console while debugging, I saw that the records are still processed one at a time. The only difference is that the order in which records get inserted is shuffled. Overall, the run took the same time as when the activities executed sequentially. Is there another way to make the inserts run in parallel?

My steps include:

        flowController.on('START', function () {
          // Uploading and storing the binary data as excel file 
        });

        flowController.on('1', function () {
          // Validation to check if file contain required sheet with name and all and convert the file into JSON format
        });

        flowController.on('2', function () {
          // Replacing the keys from the JSON to match the required keys 
        });

        flowController.on('3', function () {
          // Massive validation of records & separating the records as per the activity
        });

        // Activity 1  
        flowController.on('4', function () {
          // Insert records of Activity 1  
        });

        // Activity 2  
        flowController.on('5', function () {
          // Insert records of Activity 2
        });

        // Activity 3  
        flowController.on('6', function () {
          // Insert records of Activity 3
        });

        // End of the flow
        flowController.on('END', function () {
          // End
        });
  • If those are sync activities, async parallel won't help Commented Sep 11, 2017 at 6:13
  • Perhaps you should show your code. If it's taking you 40 minutes to run 700 lines then you are doing something drastically wrong. Bottom line is you are asking "can I do it?" without actually showing us what you are doing. Too broad as a general question. Commented Sep 11, 2017 at 6:15
  • @SuhailGupta I didn't get you.... Commented Sep 11, 2017 at 6:18
  • @NeilLunn That was covered in my opening lines. I also mentioned that I reduced the validation steps strategically. The time I mentioned includes both validation and insert operations. I will update the question to make the context clearer. Commented Sep 11, 2017 at 6:20

1 Answer

Apparently, the logic in your application takes longer than a few seconds to execute. Consider moving it off the main thread, especially if it will run many times per hour or day.

Because Node runs your JavaScript on a single thread, long-running synchronous work blocks other code from executing and gives end users the impression that your application is slow.

You can spin up a child process using Node's child_process module, and child processes can easily communicate with the parent process through a messaging system. You can spin up multiple processes at the same time and have them execute in parallel.

First, move the whole long-computation function into its own file and have that file invoke the function when instructed via a message from the main process. Then, instead of doing the long operation in the main process's event loop, you can fork the long-computation.js file and use the messaging interface to exchange messages between the server and the forked process.

With this arrangement, when a request for the long computation comes in, the main process simply sends a message to the forked process to start the long operation. The main process's event loop is not blocked.

Once the forked process is done with the long operation, it can send its result back to the parent process using process.send.
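The fork-and-message approach described above can be sketched roughly as follows. This is a minimal illustration, not your actual import code: the summing loop is a hypothetical stand-in for the long operation, `runInChild` is a made-up helper name, and the child script is written to a temporary directory only so that the sketch fits in one block. In a real project, long-computation.js would simply be its own file.

```javascript
const { fork } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Child script. Normally this lives in its own file (long-computation.js);
// it is generated here only to keep the sketch self-contained.
const childSource = `
process.on('message', ({ n }) => {
    // Hypothetical CPU-bound work standing in for the long operation.
    let total = 0;
    for (let i = 1; i <= n; i++) total += i;
    process.send(total);
});
`;
const childDir = fs.mkdtempSync(path.join(os.tmpdir(), 'fork-demo-'));
const childPath = path.join(childDir, 'long-computation.js');
fs.writeFileSync(childPath, childSource);

// Fork one child, send it a job, and resolve with its reply.
function runInChild(n) {
    return new Promise((resolve, reject) => {
        const child = fork(childPath);
        child.once('message', (result) => {
            child.disconnect(); // close the IPC channel so the child can exit
            resolve(result);
        });
        child.once('error', reject);
        child.send({ n });
    });
}

// Both children compute at the same time, each in its own process,
// while the parent's event loop stays free.
Promise.all([runInChild(1e6), runInChild(2e6)])
    .then((results) => console.log(results));
```

Forking has a startup cost per process, so this tends to pay off for a handful of long jobs (for example, one child per activity) rather than one child per record.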

Update:

Here is sample code with two functions, each containing a for loop that runs a million times. Just wrapping the two in a parallel wrapper (async.parallel in your case) doesn't mean they will magically become parallel operations. They must be async operations.

const Promise = require('bluebird');

function one() {
    return new Promise((res, rej) => {
        for (let i = 0; i <= 1000000; i++) {
            console.log('one');
            if (i == 1000000) res('Done')
        }
    });
}

function two() {
    return new Promise((res, rej) => {
        for (let i = 0; i <= 1000000; i++) {
            console.log('two');
            if (i == 1000000) res('Done')
        }
    });
}


let a = [one(), two()];

Promise.all(a).then(r => console.log(r));

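Function two will always execute after function one, because each loop runs synchronously inside its Promise executor and blocks the event loop until it finishes.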
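For contrast, here is a minimal sketch of what happens when the tasks really are asynchronous. The `fakeInsert` helper is hypothetical: its timer stands in for a database round trip, and it uses the native Promise rather than bluebird. Because each task hands control back to the event loop while it waits, both start before either one finishes.

```javascript
// Simulated asynchronous insert: the timer stands in for a database round trip.
function fakeInsert(label, ms, log) {
    return new Promise((resolve) => {
        log.push(`${label} started`);
        setTimeout(() => {
            log.push(`${label} finished`);
            resolve(label);
        }, ms);
    });
}

async function runBoth() {
    const log = [];
    // Both inserts are started immediately; Promise.all waits for both.
    const results = await Promise.all([
        fakeInsert('one', 50, log),
        fakeInsert('two', 10, log),
    ]);
    return { log, results };
}

runBoth().then(({ log, results }) => {
    console.log(log);
    // [ 'one started', 'two started', 'two finished', 'one finished' ]
    console.log(results);
    // [ 'one', 'two' ]
});
```

The waits overlap here, so the total time is roughly that of the slower task. This only helps when the work is I/O-bound; CPU-bound validation would still block the single thread.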

2 Comments

Basically it's a one-time activity, but the number of records may vary and may be in the lakhs. My biggest concern is the time.
@Shaggie Since each operation is a sync operation, they can never execute in parallel. If there is some way you could delegate the work to a different process/thread, it will help reduce the overall time.
