I am writing a web scraper in JavaScript using request and cheerio. The page I'm trying to scrape contains JavaScript embedded in its HTML. That JavaScript is what I'm interested in, but I can't find a way to access it. Is there a way to extract it using cheerio?

Many thanks for any suggestions; I've just started with web scraping.

My code is:

var request = require('request');
var cheerio = require('cheerio');

var credentials = {
    username: 'username',
    password: 'password'
};

// Log in first, then fetch the page once the login request has completed.
request.post({
    uri: 'http://webpage',
    headers: { 'content-type': 'application/x-www-form-urlencoded' },
    body: require('querystring').stringify(credentials)
}, function(err, res, body) {
    if (err) {
        // No callback is defined in this snippet, so report the error directly.
        console.error('Login failed:', err);
        return;
    }

    request('http://webpage', function(err, res, body) {
        if (err) {
            console.error('Request failed:', err);
            return;
        }

        var $ = cheerio.load(body);
        var text = $('#element').text();
        console.log($.html()); // prints the whole parsed document
    });
});

1 Answer

If you're looking for the JavaScript inside the webpage, you can use cheerio to select all the <script> tags in the HTML and then read the content of each one.

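var scripts = [];

request('http://webpage', function(err, res, body)
{
  if(err) {
    // No callback is defined in this snippet, so report the error directly.
    console.error('Request failed:', err);
    return;
  }

  var $ = cheerio.load(body);
  // Store the contents of each <script> element, in document order.
  $('script').each(function(i, element) {
    scripts[i] = $(element).text();
  });
});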

You'll now have an array holding the contents of every <script> element in the HTML. Scripts that are imported from an external file have no content in the page itself, so their entries will be empty. To tell the two apart, check whether the element has a src attribute.

...

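$('script').each(function(i, element) {
  if ($(element).attr('src') === undefined) {
    // Inline script: keep its contents.
    scripts[i] = $(element).text();
  }
  else {
    // External script: collect the src URL or ignore it.
    // Note that scripts[i] stays unset for this index.
  }
});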

...

I haven't tested this, but it should work based on cheerio's documentation.

https://github.com/cheeriojs/cheerio
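
Because scripts[i] is only assigned for inline scripts, the array above can end up with gaps at the indexes of external scripts. The following is a minimal, untested-against-your-page sketch of one way around that: it pushes into two separate lists instead of assigning by index. The helper names partitionScripts and describeScripts are hypothetical and not part of cheerio. It also reads the script body with .html() rather than .text(), on the assumption that .html() returns the raw contents of a <script> element consistently across cheerio versions.

```javascript
// Hypothetical helpers for illustration; neither name comes from cheerio.

// Split script descriptors into inline source code and external URLs.
// push() keeps both lists dense, unlike scripts[i] = ... with skipped indexes.
function partitionScripts(entries) {
  var inline = [];
  var external = [];
  entries.forEach(function (entry) {
    if (entry.src === undefined) {
      inline.push(entry.code);   // inline script: keep the code
    } else {
      external.push(entry.src);  // external script: keep the URL
    }
  });
  return { inline: inline, external: external };
}

// Build one descriptor per <script> element from an HTML string.
// cheerio is required lazily so partitionScripts can be used on its own.
function describeScripts(html) {
  var cheerio = require('cheerio');
  var $ = cheerio.load(html);
  return $('script').map(function (i, element) {
    return { src: $(element).attr('src'), code: $(element).html() };
  }).get();
}
```

Inside the request callback, this could be used as partitionScripts(describeScripts(body)), which would give the inline code in .inline and the URLs of imported scripts in .external.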
