Scraping HTML data from website in Python

Question

I'm trying to scrape specific pieces of HTML data from websites, but I can't extract the parts I want. For instance, I set myself the challenge of scraping the number of followers from this blog, but I can't work out how.

I've tried urllib, requests, and BeautifulSoup, as well as the Jam API.

Here's what my code looks like at the moment:

from bs4 import BeautifulSoup
from urllib import urlopen
import json
import urllib2

html = urlopen('http://freelegalconsultancy.blogspot.co.uk/')
soup = BeautifulSoup(html, "lxml")
print soup

How would I go about pulling the number of followers in this instance?

jmoz · Accepted Answer · 2016-10-12 11:02:46Z


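You can't grab the follower count this way because it is rendered by a widget that JavaScript loads after the page arrives, so it is not in the HTML that urlopen returns. What you can grab is anything present in that static HTML, selected by CSS class, by id, or by element.

For example: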

from bs4 import BeautifulSoup
from urllib import urlopen

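# Download the page; this returns only the static HTML the server sends.
html = urlopen('http://freelegalconsultancy.blogspot.co.uk/')
# Name the parser explicitly, as the question's code does.
soup = BeautifulSoup(html, "lxml")

# The blog title sits in the static <h1>, so it can be read directly.
assert soup.h1.string == '\nLAW FOR ALL-M.MURALI MOHAN\n'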
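
To make the three lookup styles concrete without depending on the live blog, here is a minimal sketch that parses an inline HTML string. The markup, the ids (description, followers-widget), the class name (title), and the helper extract_static_parts are all invented for illustration and are not taken from the actual page. It uses the built-in "html.parser" backend so lxml is not required.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a downloaded page. The ids and
# class names are assumptions made up for this example.
SAMPLE_HTML = """
<html><body>
  <h1 class="title">Example Blog</h1>
  <div id="description">Posts about law</div>
  <div id="followers-widget"></div>
  <script>/* in a browser, a script would fill #followers-widget */</script>
</body></html>
"""


def extract_static_parts(html):
    """Return text found by element, by CSS class, and by id."""
    soup = BeautifulSoup(html, "html.parser")
    # The widget container exists in the static HTML, but it is empty
    # because no JavaScript runs when the markup is only parsed.
    widget = soup.find(id="followers-widget")
    return {
        "by_element": soup.h1.get_text(strip=True),
        "by_class": soup.select_one(".title").get_text(strip=True),
        "by_id": soup.find(id="description").get_text(strip=True),
        "widget_text": widget.get_text(strip=True),
    }


parts = extract_static_parts(SAMPLE_HTML)
print(parts)
```

The empty widget_text value mirrors the situation in the question: the container may be in the downloaded HTML, but the number a script would put there is not.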

