
When I try to scrape a React.js website using Node.js, I get only the contents of the index.html file, not the elements that the site actually renders. Here is what I have tried:

    const request = require("request");
    const cheerio = require("cheerio");

    const URL = "https://pydata-jal.netlify.com/";

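    // Fetch the raw HTML the server returns; no JavaScript is executed here.
    request(URL, (err, res, body) => {
      if (err) {
        console.error("Request failed:", err.message);
        return;
      }
      if (res.statusCode === 200) {
        // Parse the static markup and print it back out.
        const $ = cheerio.load(body);
        console.log($.html());
      }
    });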

What should I do to get all of the tags that the React website renders?

Also, can I scrape the Hacker Noon website (just as an example), and is that legal?

1 Answer

Cheerio only parses HTML that has already been rendered (e.g., static HTML). To get the output that React renders, you need a headless browser controlled with a tool like Puppeteer.
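As a rough illustration of the headless-browser approach, here is a minimal sketch. It assumes Puppeteer is installed (`npm install puppeteer`). The function name `fetchRenderedHtml` and the optional second parameter are hypothetical, added only so the function can be exercised without a real browser. Waiting for `networkidle0` is one reasonable choice, not the only one.

```javascript
// Sketch: load a page in headless Chrome so its JavaScript runs,
// then return the HTML of the rendered DOM.
// `fetchRenderedHtml` is an illustrative name, not part of any library.
async function fetchRenderedHtml(url, puppeteer = require("puppeteer")) {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // "networkidle0" resolves once there have been no network connections
    // for at least 500 ms, which usually means the app has finished loading.
    await page.goto(url, { waitUntil: "networkidle0" });
    // page.content() serializes the current DOM, including React's output.
    return await page.content();
  } finally {
    // Always close the browser, even if navigation fails.
    await browser.close();
  }
}

// Run only when a URL is passed on the command line.
const target = process.argv[2];
if (target) {
  fetchRenderedHtml(target)
    .then((html) => console.log(html))
    .catch((err) => console.error("Scrape failed:", err.message));
}
```

Saved as, say, `scrape.js` (a hypothetical file name), it could be run with `node scrape.js https://pydata-jal.netlify.com/`. The returned string can then be passed to `cheerio.load()` exactly as in the question.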


2 Comments

Does that mean we can never scrape a React website using cheerio?
Yes. Cheerio parses the HTML content and lets you access its nodes in a jQuery-like fashion. React needs a browser engine in order to render correctly (the JavaScript has to be executed, and the DOM has to be manipulated and reconciled with the virtual DOM).
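To see the difference concretely, here is a small dependency-free sketch that checks whether a fetched page is still just the unrendered shell. It is only a heuristic. The helper name `hasEmptyMountNode` is hypothetical, and the mount id `root` is an assumption (it is the create-react-app default, and the target site may use something else).

```javascript
// Heuristic sketch: does this HTML contain an empty mount node
// such as <div id="root"></div>? If so, React has not rendered into it.
// `mountId` is assumed to be a plain id with no regex metacharacters.
function hasEmptyMountNode(html, mountId = "root") {
  const pattern = new RegExp(
    `<div[^>]*\\bid=["']${mountId}["'][^>]*>\\s*</div>`,
    "i"
  );
  return pattern.test(html);
}

// What a plain HTTP request typically returns for a client-rendered app.
const shell =
  '<html><body><noscript>Enable JavaScript</noscript><div id="root"></div></body></html>';
// What the DOM looks like after a browser has executed the bundle.
const rendered =
  '<html><body><div id="root"><h1>PyData</h1></div></body></html>';

console.log(hasEmptyMountNode(shell));    // true
console.log(hasEmptyMountNode(rendered)); // false
```

If the check returns `true` for the body you downloaded, cheerio has nothing to select from, and the page needs to be rendered in a browser first.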
