I am using puppeteer to do some testing.

No code written because I don't even know how to approach this.

• I have a list of 10 IDs inside an array

• For each ID -  a new page/tab is opened

• I want to run the script on each page/tab without waiting for the previous page/tab to finish before starting the next, i.e. simultaneous execution.

So 10 pages will be running the same script at the same time?

Is this possible with JavaScript and puppeteer?

2 Answers

You might want to check out puppeteer-cluster (I'm the author of that library), which supports your use case. The library runs tasks in parallel, but also takes care of error handling, retrying and some other things.

You should also keep in mind that opening 10 pages for 10 URLs is quite costly in terms of CPU and memory. You can use puppeteer-cluster to use a pool of browsers or pages instead.

Code Sample

You can see a minimal example below. It's also possible to use the library in more complex settings.

const { Cluster } = require('puppeteer-cluster');

(async () => {
  const cluster = await Cluster.launch({
    concurrency: Cluster.CONCURRENCY_PAGE, // one page per worker, all inside a single shared browser
    maxConcurrency: 4, // Open up to four pages in parallel
  });

  // Define a task to be executed for your data, this function will be run for each URL
  await cluster.task(async ({ page, data: url }) => {
    await page.goto(url);
    // ...
  });

  // Queue URLs (you can of course read them from an array instead)
  cluster.queue('http://www.google.com/');
  cluster.queue('http://www.wikipedia.org/');
  // ...

  // Wait for cluster to idle and close it
  await cluster.idle();
  await cluster.close();
})();
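
To make the pooling idea concrete, here is a minimal sketch in plain JavaScript of a worker pool with a concurrency cap, written without the library. It is not how puppeteer-cluster is implemented; `runWithLimit` and `fakeVisit` are hypothetical names, and `fakeVisit` only simulates page work with a timer.

```javascript
// Run `worker` over every item, with at most `limit` calls in flight at once.
// `runWithLimit` is a hypothetical helper, not part of puppeteer or puppeteer-cluster.
async function runWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next item to hand out

  // Each runner keeps pulling items until none are left.
  async function runner() {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }

  // Start `limit` runners (or fewer if there are fewer items).
  const runners = [];
  for (let n = 0; n < Math.min(limit, items.length); n++) {
    runners.push(runner());
  }
  await Promise.all(runners);
  return results;
}

// Stand-in for real page work (an assumption; real code would use a page here).
const fakeVisit = (id) =>
  new Promise((resolve) => setTimeout(() => resolve(`done ${id}`), 10));

// Usage: five IDs, never more than two being processed at the same time.
runWithLimit(['1', '2', '3', '4', '5'], 2, fakeVisit).then(console.log);
```

The results come back in the same order as the input array, regardless of which item finished first.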

2 Comments

Thanks for the library - I'll test it out tonight. Would running headless help with CPU usage in my case? @ThomasDondorf
@Mike No, that does not make a difference, but there are multiple tools to find out what suits your machine.

Yes, this is the default asynchronous behavior: browser.newPage() returns a promise, so you can open 10 tabs without waiting for each one in turn and run your script on all of them.

Here is a sample:

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch({
        headless: false
    });
    const ids = ['1', '2', '3']; // replace with your 10 IDs
    const pool = [];

    for (let index = 0; index < ids.length; index++) {
        pool.push(
            browser.newPage() // create a new page for each id
                .then(async page => {
                    const currentId = ids[index];
                    // run your script on the current page; await (or return) its
                    // promises here so that Promise.all below waits for the work
                    // itself, not just for the tab to open
                })
        );
    }

    await Promise.all(pool); // wait until every page has finished
    await browser.close(); // close the browser
})();
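
If each page's result should be handled as soon as it is ready, rather than after every page has finished, one option is to attach a handler to each promise before collecting them. The following is a minimal sketch under that assumption; `runScript`, `runAll` and `onResult` are hypothetical names, and `runScript` only simulates the per-page script with a timer.

```javascript
// Stand-in for the per-page script; resolves after `ms` milliseconds.
// `runScript` is a hypothetical placeholder for real page work.
const runScript = (id, ms) =>
  new Promise((resolve) => setTimeout(() => resolve(`result for ${id}`), ms));

// Start every job at once and report each result the moment it arrives.
async function runAll(jobs, onResult) {
  const pool = jobs.map(({ id, ms }) =>
    runScript(id, ms).then((value) => {
      onResult(id, value); // fires as soon as this job finishes
      return value;
    })
  );
  // allSettled waits for every job, even if some reject, so cleanup
  // (for example closing the browser) can safely follow this call.
  return Promise.allSettled(pool);
}

// Usage: 'b' has the shorter delay, so it is reported before 'a'.
runAll(
  [{ id: 'a', ms: 30 }, { id: 'b', ms: 10 }],
  (id, value) => console.log(id, value)
);
```

The early results are available through the callback, while the returned promise still tells you when it is safe to close the browser.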

2 Comments

Promise.all() will wait until all pages are resolved? Is it possible for the pages that resolve before the others to return their values, without having to wait until all are complete?
Never mind, I removed the close() and it worked as intended. With close(), the browser would close after just one iteration.
