Scraping JavaScript using Python

Question

I am trying to scrape running routes, to geoprocess in R, from the following site: http://runkeeper.com/user/127244964/route/1149604

I am trying to do that with this code:

from bs4 import BeautifulSoup
import urllib2
import csv
import os
import requests

page1 = urllib2.urlopen("http://runkeeper.com/user/212579518/route/513771")
soup = BeautifulSoup(page1, "html.parser")
print(soup)

When I print the results, I see that the data I need is inside a text/javascript script block:

var routePoints = [{"latitude":38.918704,"longitude":-77.036478,"deltaDistance":0,"type":"StartPoint","altitude":40,"deltaPause":0}

I need to scrape the variables inside the dictionary. Any suggestions on how to do this?

Thanks.

Brad Culberson · Accepted Answer · 2014-02-25 03:32:51Z

1

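This searches the soup output with a regex and loads the matched JSON into a Python object for you to use.

import re
import json

# Capture everything between "routePoints =" and the last ";" on that line
point_re = re.compile(r'routePoints\s*=\s*(.*);')
point_json = point_re.search(str(soup)).group(1)
# Parse the captured JavaScript array literal as JSON
point_data = json.loads(point_json)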

2 Comments

asado23 Over a year ago

Thanks, this seems to get all the points that I need. If I wanted to save this to a CSV file, what would you suggest? Also, if you can recommend a good BeautifulSoup tutorial or book, I would appreciate it.

Brad Culberson Over a year ago

You could use the csv module (docs.python.org/2/library/csv.html), but it is just as easy to open a file and write the lines you want. As long as you are just dumping numerics, it will be pretty easy.
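
A minimal sketch of the csv-module route mentioned in the comment above, assuming point_data is the list of dictionaries produced by json.loads in the answer. The function name write_points and the choice of columns are illustrative only, and the sample point is the single one quoted in the question. It is written for Python 3.

```python
import csv
import io

# Sample data: the one point quoted in the question, as json.loads would return it.
point_data = [
    {"latitude": 38.918704, "longitude": -77.036478, "deltaDistance": 0,
     "type": "StartPoint", "altitude": 40, "deltaPause": 0},
]

def write_points(points, out,
                 fields=("latitude", "longitude", "altitude", "type")):
    """Write the selected keys of each point dict as CSV rows to `out`.

    `write_points` and the default `fields` are hypothetical choices
    made for this example.
    """
    # extrasaction="ignore" drops keys that are not listed in `fields`
    writer = csv.DictWriter(out, fieldnames=fields, extrasaction="ignore")
    writer.writeheader()
    for point in points:
        writer.writerow(point)

# Write to an in-memory buffer here; for a real file, pass something like
# open("route.csv", "w", newline="") instead.
buffer = io.StringIO()
write_points(point_data, buffer)
print(buffer.getvalue())
```

The resulting file has a header row, so it should load in R with read.csv without further changes.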

Amadan · Accepted Answer · 2014-02-25 02:54:44Z

0

Use a regexp to strip everything outside the square brackets (or, alternatively, to select only the content of the outermost brackets), then call json.loads on the bracketed text.
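
One way to sketch the "outermost brackets" idea: a regexp finds where the array starts, and json.JSONDecoder.raw_decode (used here in place of json.loads, so the end of the array does not need its own pattern) parses exactly one JSON value from that position and ignores whatever follows. The name extract_route_points and the page_text sample are assumptions for illustration. The array in the question is cut off, so it is closed by hand here.

```python
import json
import re

# Hypothetical page text: the point from the question with the array closed
# by hand, followed by an unrelated made-up statement.
page_text = (
    'var routePoints = [{"latitude":38.918704,"longitude":-77.036478,'
    '"deltaDistance":0,"type":"StartPoint","altitude":40,"deltaPause":0}];\n'
    'var other = [1, 2];'
)

def extract_route_points(text):
    """Return the routePoints array embedded in `text` as a Python list."""
    match = re.search(r'routePoints\s*=\s*\[', text)
    if match is None:
        raise ValueError("routePoints not found")
    start = match.end() - 1  # index of the opening "["
    # raw_decode parses one complete JSON value beginning at `start`
    # and returns it together with the index where that value ends.
    points, _end = json.JSONDecoder().raw_decode(text, start)
    return points

points = extract_route_points(page_text)
print(points[0]["latitude"], points[0]["longitude"])
```

This only works while the embedded array is valid JSON, which the quoted routePoints fragment appears to be.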
