
I have a problem while developing a crawler with Node.js/Puppeteer. The old crawler worked like this (roughly the sketch below this list):

  1. Crawl Pages
  2. Store the output file locally with the fs module
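
A minimal sketch of that old flow, assuming a placeholder start URL and output path (not my real values):

    // Old flow: crawl a page with Puppeteer, then write the result to disk with fs.
    const puppeteer = require('puppeteer');
    const fs = require('fs');

    (async () => {
      const browser = await puppeteer.launch();
      const page = await browser.newPage();
      await page.goto('https://example.com'); // placeholder start URL

      // Extract whatever the crawler needs; the page title stands in here.
      const data = await page.evaluate(() => document.title);

      // Step 2: store the output file locally with the fs module.
      fs.writeFileSync('./output.json', JSON.stringify({ title: data }));

      await browser.close();
    })();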

Since I'm going to add a UI on the server, I've set up a scenario where the output is uploaded to S3 instead of being stored locally, and the result is shown in the UI (again, a rough sketch follows the list):

  1. Crawl Pages
  2. Write the output files to the server's disk with the fs module
  3. Get the output file back and upload it to the S3 bucket
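
Roughly what I have in mind for that scenario, assuming the AWS SDK for Node.js (v2) and placeholder bucket/key names:

    // Planned flow: the crawler has written the output file with fs;
    // read it back as a stream and upload it to the S3 bucket.
    const fs = require('fs');
    const AWS = require('aws-sdk');

    const s3 = new AWS.S3();

    async function uploadOutputFile(localPath, key) {
      const fileStream = fs.createReadStream(localPath);

      await s3
        .upload({ Bucket: 'my-crawl-output', Key: key, Body: fileStream }) // placeholder bucket
        .promise();
    }

    uploadOutputFile('./output.json', 'results/output.json')
      .then(() => console.log('uploaded'))
      .catch(console.error);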

The above is the scenario I know how to implement, and I'd like to know whether something like the following is possible instead:

  1. Crawl Pages
  2. Stream the data held in memory directly to the S3 bucket (no local file)

If you have experience with a scenario like this, I would appreciate some guidance. Any comment or reply would be greatly appreciated. :)

1 Answer


This is definitely possible. If you pipe your crawler's output stream on the server directly into an S3 upload, that completes the loop without touching the disk.

This is possible because you can stream uploads to S3, even without knowing the size of the file beforehand.
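
For example, here is a minimal sketch with the AWS SDK for Node.js (v2) and a PassThrough stream; the bucket and key names are placeholders:

    // Pipe crawler output through a PassThrough stream straight into a managed S3 upload.
    const AWS = require('aws-sdk');
    const { PassThrough } = require('stream');

    const s3 = new AWS.S3();

    function uploadStreamToS3(key) {
      const pass = new PassThrough();

      // s3.upload() accepts a readable stream as Body and handles the multipart
      // upload for you, so the total size does not need to be known in advance.
      const done = s3
        .upload({ Bucket: 'my-crawl-output', Key: key, Body: pass }) // placeholder bucket
        .promise();

      return { writeStream: pass, done };
    }

    // In the crawler, write results to the stream instead of to a local file:
    async function crawlAndUpload(page, url) {
      const { writeStream, done } = uploadStreamToS3('results/output.json');

      await page.goto(url);
      const title = await page.evaluate(() => document.title); // whatever you extract

      writeStream.end(JSON.stringify({ url, title }));
      await done; // resolves once S3 has received the whole object
    }

With the v3 SDK the equivalent is the Upload class from @aws-sdk/lib-storage, but the idea is the same.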

This answer should help you out: S3 file upload stream using node js

If you post some code, we can answer this a little better, but I hope this points you in the right direction.
