I'm trying to scrape specific pieces of data from the HTML of certain websites, but I can't extract the parts I want. For instance, I set myself the challenge of scraping the number of followers from this blog, but I can't get it to work.

I've tried urllib, requests, and BeautifulSoup, as well as the Jam API.

Here's what my code looks like at the moment:

from bs4 import BeautifulSoup
from urllib import urlopen
import json
import urllib2

html = urlopen('http://freelegalconsultancy.blogspot.co.uk/')
soup = BeautifulSoup(html, "lxml")
print soup

How would I go about pulling the number of followers in this instance?

1 Answer

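You can't grab the follower count this way because it sits in a widget that is loaded by JavaScript, so it is not in the HTML that urlopen returns. What you can grab are the parts of the HTML that are present in the page source, selecting them by CSS class, by id, or by element.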

E.g:

from bs4 import BeautifulSoup
from urllib import urlopen

html = urlopen('http://freelegalconsultancy.blogspot.co.uk/')
soup = BeautifulSoup(html, "lxml")  # name the parser explicitly, as in the question

assert soup.h1.string == '\nLAW FOR ALL-M.MURALI MOHAN\n'
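
To make the three selection styles concrete, here is a minimal sketch. It assumes Python 3 with the bs4 package installed, and it parses a small inline HTML string instead of fetching the blog. The markup, the class names (title, archive, followers-widget) and the sidebar id are all made up for illustration; they are not taken from the real page. The widget div is left empty on purpose, to mimic content that a script would only fill in inside a browser.

```python
from bs4 import BeautifulSoup

# Hypothetical static HTML standing in for a downloaded page.
# The followers widget is an empty placeholder because a script
# would fill it in only after the page loads in a browser.
SAMPLE_HTML = """
<html><body>
  <h1 class="title">Example Blog</h1>
  <div id="sidebar">
    <div class="followers-widget"></div>
    <ul class="archive"><li>2016</li><li>2015</li></ul>
  </div>
  <script>/* fills .followers-widget at runtime */</script>
</body></html>
"""

# "html.parser" ships with Python, so no extra parser is needed here.
soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Select by element name.
title = soup.h1.get_text(strip=True)

# Select by id.
sidebar = soup.find(id="sidebar")

# Select by CSS selector (class plus descendant element).
years = [li.get_text() for li in soup.select("ul.archive li")]

# The widget element exists in the static HTML, but it has no text.
widget_text = soup.find(class_="followers-widget").get_text(strip=True)

print(title)
print(years)
print(repr(widget_text))
```

If the widget comes back empty like this on a real page, the number is probably being filled in by a script, and you would need something that executes JavaScript or the endpoint the script itself calls.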