4

I have a route on an Express server that calls an external API which sends back a list of files on that server. Then I call another of its APIs to get the contents of each file. Once I have that, I write the contents of each file to a new file in my project's root directory.

That's all fine and works well. The problem is when I do it with more than one user. The request takes about 3 minutes to complete, and if it's just one instance of my app calling the route, it works fine every time. But if I open another instance, log in with another user, and start the same request at the same time, I run into issues.

It's not a timeout issue, although I have dealt with that while working on this and already found ways around that. This definitely has to do with multiple users hitting the route at once.

Sometimes it doesn't complete at all, sometimes it quickly throws an error for both users, and sometimes just one fails while the other completes.

I've been looking around, and I suspect that I'm blocking the event loop and need to use something like worker threads. My question is: am I on the right track with that, or is it something else I don't know about?

The code basically looks like this:

//this whole request takes about 3 minutes to complete (if successful) due to rate limiting of the external APIs.
//it's hard to imagine why I would want to do this kind of thing, but that's not so important.. what is really important
//is why I get issues with more than 1 user hitting the route.
router.get('/api/myroute', (req, res, next) => {

    //contact a remote server's API, it sends back a big list of files.
    REMOTE_SERVER.file_list.list(USER_CREDS.id).then(files => {

        //we need to get the contents of each specific file, so we do that here.
        Promise.all(files.map((item, i) =>
            //they have an API for specific files, but you need the list of those files first like we retrieved above.
            REMOTE_SERVER.specific_file.get(USER_CREDS.id, {
                file: { key: files[i].key }
            }).then(asset => {

                //write the contents of each file to a directory called "my_files" in the project root.
                fs.writeFile('./my_files/' + item.key, asset.value, function (err) {
                    if (err) {
                        console.log(err);
                    };
                });
            })))
            .then(() => {
                console.log("DONE!!");
                res.status(200).send();
            })
    });
});
3 Comments

  • I suspect there are synchronicity issues involved, i.e. two users writing/reading the same file. Commented May 5, 2019 at 18:05
  • You can disable this api while processing. Commented May 5, 2019 at 18:09
  • @NikosM no, all the files involved are different on my server and the APIs are different... however they are being written to the same directory in my project folder. Different files, just inside the same folder. Commented May 5, 2019 at 18:23

2 Answers

4

You've hit the default limits of Node's async I/O! Long story short: for the fs module, Node.js uses the libuv thread pool, whose size is 4 by default. For some things Node delegates the work to the operating system's async handlers (epoll, kqueue, etc.), but for stuff like DNS, crypto or, in our case, the file system, it uses libuv. Most likely the number of files you want to write to disk is bigger than 4, and it gets even bigger when a parallel request comes in. At the end of the day you simply run out of libuv threads, and then Node has nothing to do but wait until at least one thread is free. The behavior depends on the number of files in flight, which is why your app isn't stable.

What you can do is increase the size of the thread pool by passing the UV_THREADPOOL_SIZE environment variable with a value bigger than 4. But it's still very limited; Node's event loop model is honestly not the best choice for this kind of work. Also think about cases where different requests write files with the same names. If you're OK with a "last write wins" concurrency model, it might be fine for you, but your files can still end up corrupted due to the wrong order of operations. That's a pretty tough problem to solve.
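
As a minimal sketch of raising the pool size (the entry filename server.js and the value 64 are just placeholders): setting the variable on the command line is the most reliable way, because the pool is created lazily the first time fs, dns or crypto work is queued.

// option 1: set it when starting the process
//   UV_THREADPOOL_SIZE=64 node server.js

// option 2: set it at the very top of server.js,
// before the first fs/dns/crypto call ever touches the pool
process.env.UV_THREADPOOL_SIZE = 64;

const fs = require('fs');
// ...rest of the app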

For more details about libuv and those fancy thread pools, I recommend watching this pretty good talk.

Actually Node's official docs on fs warn you about such behavior.
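
Related to that warning: in the snippet from the question, the writes themselves are never awaited, because the fs.writeFile callback doesn't feed back into the promise chain. Here is a sketch of the inner step using fs.promises so that Promise.all only resolves once every write has finished (treating asset.value as the file contents is an assumption carried over from the question):

//inside the files.map(...) callback from the question
REMOTE_SERVER.specific_file.get(USER_CREDS.id, {
    file: { key: item.key }
}).then(asset =>
    //returning the write promise makes Promise.all wait for the write to finish
    fs.promises.writeFile('./my_files/' + item.key, asset.value)
)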


4 Comments

I see, wow it really does say it very plainly in the docs. Thank you for pointing that out.
Yeah, you were right, you actually end up blocking the event loop. Node's non-blocking I/O is actually an oversimplification. :)
And what about spawning a new child process every time the route is called and the file transfer and writing begins? Can you see an issue with that sort of solution?
It might work for you, but keep in mind that Node child processes are not created with a simple fork syscall: Node spins up a new V8 instance for each one, which can hit performance a lot too. Run performance tests. If you host your app on AWS, I'd recommend scaling across multiple instances and making use of a shared file system with EFS.
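
For reference, a rough sketch of the per-request child process idea discussed in these comments, assuming the download-and-write logic is moved into a separate script called file-job.js (both the filename and the message shape are hypothetical). As noted above, each fork is a full Node/V8 instance, so measure before committing to this.

const { fork } = require('child_process');

router.get('/api/myroute', (req, res) => {
    //each request gets its own Node process (and its own libuv thread pool)
    const job = fork('./file-job.js');

    job.send({ userId: USER_CREDS.id });

    job.on('message', msg => {
        if (msg.done) res.status(200).send();
    });
    job.on('error', err => {
        console.log(err);
        res.status(500).send();
    });
});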
0
router.get('/api/myroute', (req, res, next) => {

    //check whether this API is already processing a request
    if (global.isLocked_ApiMyroute) {
        res.status(200).send('Please try again after a few minutes');
        return;
    }

    //contact a remote server's API, it sends back a big list of files.

    //lock this api while processing
    global.isLocked_ApiMyroute = true;

    REMOTE_SERVER.file_list.list(USER_CREDS.id).then(files => {

        //we need to get the contents of each specific file, so we do that here.
        Promise.all(  ... )
            .then(() => {
                console.log("DONE!!");
                res.status(200).send();
                global.isLocked_ApiMyroute = false;
            })
            .catch(() => { // added catch block: in any case, isLocked_ApiMyroute must be reset to false
                global.isLocked_ApiMyroute = false;
            })
    });
});

Of course, this answer is not a good solution, but with a little work we can use a Node.js global to lock this API while it's processing.


Another tip: if there is a problem with writes to the same file name, we can solve it with

  1. Write to a temporary file name
  2. Rename the temporary file to the correct file name

But if the problem is with reading the same file (a problem coming from the third-party API), the lock is more stable.
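
A minimal sketch of that write-then-rename idea, assuming the file contents are already in memory (the helper name writeFileAtomic and the temp-name scheme are my own; the rename step is atomic when the temp file lives on the same filesystem):

const fs = require('fs');
const path = require('path');
const crypto = require('crypto');

//write to a unique temporary name first, then rename into place,
//so a half-written file never sits at the final path
async function writeFileAtomic(finalPath, contents) {
    const tmpPath = path.join(
        path.dirname(finalPath),
        '.' + path.basename(finalPath) + '.' + crypto.randomBytes(6).toString('hex') + '.tmp'
    );
    await fs.promises.writeFile(tmpPath, contents);
    await fs.promises.rename(tmpPath, finalPath);
}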


Also, please add a catch(error => console.log(error)) to the then chain.
This can help you find where the problem comes from.

