1

I'm trying to implement an asynchronous each loop in Node.js.

I have a variable html that contains the page content. I want to iterate over all divs that have a particular class. Inside those divs are links that I want to navigate to and get some content from as well. Since each expects a synchronous function, it doesn't wait for the asynchronous code inside the callback to finish.

I tried to do it like this:

const browser = await puppeteer.launch({
    headless: true
});
const page = await browser.newPage();
const page2 = await browser.newPage();
const mainUrl = "http ... ";

const html = await page.goto(mainUrl)
    .then(function() {
        return page.content();
    });

await $('.data-row', html).each(function() => {
    const url = await $(this).find(".link-details a").attr("href");
    page2.goto(url)
        .then(function() {
            const title = await page.evaluate(el => el.innerHTML, await page.$('#title'));
            // do other things 
        });
    // do other things 
    // create a json with data add it to a list  

});

But title is undefined, and that code only runs after the loop has finished. What can I do here?

2
  • Are you inside of an async closure? Commented May 22, 2019 at 0:46
  • You have mixed await and then through all of your code. You can not await a jQuery $().each. Commented May 22, 2019 at 1:01

2 Answers

1

I've edited your code to show how Puppeteer is meant to be used. The main problems were using jQuery where it isn't needed, awaiting calls that aren't asynchronous, and mixing await with a promise chain.

const puppeteer = require('puppeteer');

(async () => {

  const browser = await puppeteer.launch({
      headless: true
  });
  const page = await browser.newPage();
  const page2 = await browser.newPage();
  const mainUrl = "http ... ";

  /*const html = await page.goto(mainUrl)
    .then(function() {
        return page.content();
    });*/
  
  await page.goto(mainUrl);
  await page.waitForSelector('.data-row');
  // Values returned from the page must be serializable, so collect the
  // link URLs as plain strings inside the page instead of returning DOM nodes.
  const urls = await page.$$eval('.data-row', rows =>
    rows.map(row => row.querySelector('.link-details a').href)
  );

  /*await $('.data-row', html).each(function() => {
      const url = await $(this).find(".link-details a").attr("href");
      await page2.goto(url)
          .then(function() {
              const title = await page.evaluate(el => el.innerHTML, await page.$('#title'));
              // do other things 
          });
      // do other things 
      // create a json with data add it to a list  

  });*/
  
  // for...of pauses at each await, so the links are visited one at a time.
  for (const url of urls) {
    await page2.goto(url);
    const title = await page2.evaluate(() => document.title);
    console.log(title);
  }
  
})()


1 Comment

for (const row of dataRows) { const url = row.querySelector(".link-details a").href;... I'm getting the error TypeError: dataRows is not iterable
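
The TypeError in the comment above is consistent with how page.evaluate hands values back to Node.js: the return value is serialized, and a NodeList of DOM elements does not survive that, so dataRows is not something that can be iterated. The following is a minimal sketch of one way around it, assuming page and page2 are Puppeteer Page objects created elsewhere and that each row contains a ".link-details a" link. The helper name collectTitles and the shape of the result objects are hypothetical.

```javascript
// Sketch only: `page` and `page2` are assumed to be Puppeteer Page objects;
// `collectTitles` is a hypothetical helper name, not part of Puppeteer.
async function collectTitles(page, page2, mainUrl) {
  await page.goto(mainUrl);
  await page.waitForSelector('.data-row');

  // $$eval runs the callback inside the page. Returning plain strings
  // (the hrefs) keeps the result serializable, unlike DOM nodes.
  const urls = await page.$$eval('.data-row', rows =>
    rows.map(row => row.querySelector('.link-details a').href)
  );

  // Visit the links sequentially and build up a list of plain objects.
  const results = [];
  for (const url of urls) {
    await page2.goto(url);
    results.push({ url, title: await page2.title() });
  }
  return results;
}
```

Inside the async function from the answer it could be called as const data = await collectTitles(page, page2, mainUrl); and the returned list can then be passed to JSON.stringify.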
0

You can't await jQuery.each, so you can try the following instead.

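// This must run inside an async function, as in the question.
// toArray() and attr() are synchronous, so they need no await.
const rows = $('.data-row', html).toArray();

for (const row of rows) {
    // Inside for...of there is no `this`; wrap the current element instead.
    const url = $(row).find(".link-details a").attr("href");
    await page2.goto(url);
    // Read the title from page2, the page that was just navigated.
    const title = await page2.$eval('#title', el => el.innerHTML);
    // do other things 
    // create a json with data add it to a list
}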
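
To illustrate why awaiting an each-style call has no effect, here is a small sketch that does not use Puppeteer or jQuery at all. The names visit, each, withEach and withForOf are hypothetical: visit stands in for an asynchronous step such as page2.goto(url), and each imitates a callback-based iterator that ignores whatever its callback returns.

```javascript
// Hypothetical stand-in for an asynchronous step such as page2.goto(url).
const visit = (url, log) =>
  new Promise(resolve =>
    setTimeout(() => {
      log.push(`visited ${url}`);
      resolve();
    }, 10)
  );

// Callback-style iteration, similar in spirit to jQuery's .each:
// it calls the callback for every item and ignores any returned promise.
function each(items, callback) {
  items.forEach(item => {
    callback(item);
  });
  return items;
}

async function withEach(urls) {
  const log = [];
  // `each` returns a plain array, so this await has nothing to wait for.
  await each(urls, async url => {
    await visit(url, log);
  });
  log.push('loop finished');
  return log.slice(); // snapshot of the log at this moment
}

async function withForOf(urls) {
  const log = [];
  for (const url of urls) {
    await visit(url, log); // the loop pauses here until the visit resolves
  }
  log.push('loop finished');
  return log.slice();
}
```

With ['a', 'b'] as input, withEach resolves to ['loop finished'] because the visits are still pending when the loop ends, while withForOf resolves to ['visited a', 'visited b', 'loop finished'].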
