Parsing Strings from json in Python

Question

I am parsing a JSON document in Python and I have gotten nearly the whole process to work except I am having trouble converting a GPS string into the correct form.

I have the following form:

"gsx$gps":{"$t":"44°21′N 68°13′W\ufeff / \ufeff44.35°N 68.21°W\ufeff / 44.35; -68.21\ufeff (Acadia)"}

and that is from this HTML form:

44°21′N 68°13′W / 44.35°N 68.21°W / 44.35; -68.21 (Acadia)

and I want the final product to be a string that looks like this:

(44.35, -68.21)

Here are a few other example JSON strings to give you more to work with:

"gsx$gps":{"$t":"14°15′S 170°41′W\ufeff / \ufeff14.25°S 170.68°W\ufeff / -14.25; -170.68\ufeff (American Samoa)"}

"gsx$gps":{"$t":"38°41′N 109°34′W\ufeff / \ufeff38.68°N 109.57°W\ufeff / 38.68; -109.57\ufeff (Arches)"}

I have the following:

GPSlocation = entry['gsx$gps']['$t']

and then I don't know how to get GPSlocation into the form that I want above.

This isn't parsing JSON. He already has a dict built from the JSON; it's just parsing the format above into a tuple.
– Joran Beasley, Commented Oct 2, 2012 at 4:37

Joran Beasley · Accepted Answer · 2012-10-02 04:41:08Z

1

Not super elegant, but it works. Also, you are not parsing JSON, just parsing a string.

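# GPSlocation is the unicode string from entry['gsx$gps']['$t']
center_part = GPSlocation.split("/")[1]   # the "44.35°N 68.21°W" part
# drop the zero-width U+FEFF marks, then split on whitespace
N, W = center_part.replace(u"\ufeff", u"").split()
# keep the text before the degree sign (\xb0)
N, W = N.split(u"\xb0")[0], W.split(u"\xb0")[0]
tpl = (N, W)
# strings; the hemisphere letters are dropped, so S/W values are not negated
print(tpl)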

On a side note, these are not ints...

edited Oct 2, 2012 at 4:41

answered Oct 2, 2012 at 4:28

Joran Beasley


1 Comment

clifgray Over a year ago

alright great. yes I am just parsing a string. this gives me what I need but what exactly is the \xb0 signifying?
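
For what it's worth, "\xb0" is just an escape for the character with code point 0xB0, which is the degree sign (°), so splitting on it keeps whatever comes before the symbol. A minimal Python 3 check; the sample value "44.35°N" is only an illustration:

```python
import unicodedata

degree = "\xb0"  # hex escape for code point U+00B0

print(degree == "°")             # True: same character as the literal degree sign
print(hex(ord(degree)))          # 0xb0
print(unicodedata.name(degree))  # DEGREE SIGN

# Splitting on it keeps the text that precedes the symbol.
print("44.35°N".split(degree)[0])  # 44.35
```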

Michael · Accepted Answer · 2012-10-02 04:58:35Z

1

Here we go:

import json
jstr = """{"gsx$gps":{"$t":"14°15′S 170°41′W\ufeff / \ufeff14.25°S 170.68°W\ufeff / -14.25; -170.68\ufeff (American Samoa)"}}"""
a = json.loads(jstr)
tuple(float(x) for x in a['gsx$gps']['$t'].split('/')[-1].split(u'\ufeff')[0].split(';'))

Gives:

(-14.25, -170.68)

Or from the plain string:

GPSlocation = u"14°15′S 170°41′W\ufeff / \ufeff14.25°S 170.68°W\ufeff / -14.25; -170.68\ufeff (American Samoa)"
tuple(float(x) for x in GPSlocation.split('/')[-1].split(u'\ufeff')[0].split(';'))
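
The same split-based idea can be wrapped in small helpers. This is only a sketch for Python 3, assuming every entry ends with a "lat; lon" part followed by U+FEFF as in the samples from the question; the names parse_gps and format_gps are made up for this example:

```python
def parse_gps(gps_location):
    """Return (latitude, longitude) as floats from the last '/'-separated part."""
    decimal_part = gps_location.split("/")[-1]   # " 44.35; -68.21\ufeff (Acadia)"
    coords = decimal_part.split("\ufeff")[0]     # " 44.35; -68.21"
    lat, lon = (float(x) for x in coords.split(";"))
    return lat, lon


def format_gps(gps_location):
    """Render the coordinates in the "(lat, lon)" string form from the question."""
    lat, lon = parse_gps(gps_location)
    return "({}, {})".format(lat, lon)


samples = [
    "44°21′N 68°13′W\ufeff / \ufeff44.35°N 68.21°W\ufeff / 44.35; -68.21\ufeff (Acadia)",
    "14°15′S 170°41′W\ufeff / \ufeff14.25°S 170.68°W\ufeff / -14.25; -170.68\ufeff (American Samoa)",
    "38°41′N 109°34′W\ufeff / \ufeff38.68°N 109.57°W\ufeff / 38.68; -109.57\ufeff (Arches)",
]

for sample in samples:
    print(format_gps(sample))
```

Because the values go through float, trailing zeros are not preserved (a source value of 44.30 would be printed as 44.3).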

Some timeit numbers, showing why you might want to avoid a fancy regexp ;)

import re
import timeit
setup='GPSlocation = u"14°15′S 170°41′W\ufeff / \ufeff14.25°S 170.68°W\ufeff / -14.25; -170.68\ufeff (American Samoa)"; import re'
print timeit.timeit("map(float, GPSlocation.split('/')[-1].split(u'\ufeff')[0].split(';'))", setup=setup)
print timeit.timeit("map(float, re.findall(r'(-?\d+(?:\.\d+)?)', GPSlocation)[-2:])", setup=setup)

5.89355301857
22.6919388771
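
The benchmark above is Python 2 (print statement, map returning a list). A rough Python 3 equivalent, as a sketch only: map is lazy there, so each variant is wrapped in list() to force the work. The function names are made up here, and timings depend on the machine and interpreter, so none are quoted:

```python
import re
import timeit

GPSlocation = "14°15′S 170°41′W\ufeff / \ufeff14.25°S 170.68°W\ufeff / -14.25; -170.68\ufeff (American Samoa)"


def with_split():
    # last '/' part, cut at U+FEFF, then split on ';'
    return list(map(float, GPSlocation.split("/")[-1].split("\ufeff")[0].split(";")))


def with_regex():
    # every number in the string, keeping only the last two
    return list(map(float, re.findall(r"(-?\d+(?:\.\d+)?)", GPSlocation)[-2:]))


# Both approaches should agree before their speed is compared.
assert with_split() == with_regex()

print(timeit.timeit(with_split, number=100000))
print(timeit.timeit(with_regex, number=100000))
```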

edited Oct 2, 2012 at 4:58

answered Oct 2, 2012 at 4:38

Michael


4 Comments

clifgray Over a year ago

with GPSlocation all I have is this string: "14°15′S 170°41′W\ufeff / \ufeff14.25°S 170.68°W\ufeff / -14.25; -170.68\ufeff (American Samoa)" but I suppose if I go back a step this works

Michael Over a year ago

Just ignore the first two lines and replace a['gsx$gps']['$t'] with GPSlocation.

clifgray Over a year ago

the only problem I am having which I was having originally is that it isn't doing anything about the degree symbol and it cannot encode that

Michael Over a year ago

Well, you have to enable unicode. Also remember the little u'asdf' in front of the strings. How you import the unicode string correctly depends on your data source. json.loads creates unicode automatically for example.

Blender · Accepted Answer · 2012-10-02 04:45:41Z

0

You can extract the data with regex:

>>> import re
>>> text = '''"gsx$gps":{"$t":"44°21′N 68°13′W\ufeff / \ufeff44.35°N 68.21°W\ufeff / 44.35; -68.21\ufeff (Acadia)"}'''
>>> map(float, re.findall(r'(-?\d+(?:\.\d+)?)', text)[-2:])
[44.35, -68.21]
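
Taking the last two numbers works for these samples, but it would pick up the wrong values if the trailing name ever contained digits. A hedged alternative for Python 3, assuming the decimal pair is always written as "lat; lon", is to anchor the pattern on the semicolon. The name extract_lat_lon and the "(Acadia 2)" label are made up for this example:

```python
import re

# A signed decimal, a semicolon, then a second signed decimal.
PAIR = re.compile(r"(-?\d+(?:\.\d+)?)\s*;\s*(-?\d+(?:\.\d+)?)")


def extract_lat_lon(text):
    match = PAIR.search(text)
    if match is None:
        raise ValueError("no 'lat; lon' pair found in %r" % (text,))
    return tuple(float(group) for group in match.groups())


# The digit in the (hypothetical) name does not disturb the result.
text = "44°21′N 68°13′W\ufeff / \ufeff44.35°N 68.21°W\ufeff / 44.35; -68.21\ufeff (Acadia 2)"
print(extract_lat_lon(text))  # (44.35, -68.21)
```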

answered Oct 2, 2012 at 4:45

Blender


Need4Steed · Accepted Answer · 2012-10-02 05:20:08Z

0

re.sub(r'.+/ (-?\d{1,3}\.\d\d); (-?\d{1,3}\.\d\d)\\.+',
       r"(\g<1>, \g<2>)",
       "44°21′N 68°13′W\ufeff / \ufeff44.35°N 68.21°W\ufeff / 44.35; -68.21\ufeff (Acadia)")

answered Oct 2, 2012 at 5:20

Need4Steed


1 Comment

Michael Over a year ago

This seems to have some issues when you feed in unicode strings. Besides that, I don't think the idea was to output the values as a string, but to get a tuple that you can actually work with.
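
One plausible reading of the unicode issue: the pattern in the answer ends in \\.+, which requires a literal backslash, and that is only present when \ufeff is stored as six plain characters rather than decoded to the single character U+FEFF. A sketch for Python 3 str input that matches the decoded character instead; this is an assumption about the input, not the answer author's code:

```python
import re

text = "44°21′N 68°13′W\ufeff / \ufeff44.35°N 68.21°W\ufeff / 44.35; -68.21\ufeff (Acadia)"

# \ufeff? tolerates the zero-width mark after the longitude being present or absent.
pattern = r".+/ (-?\d{1,3}\.\d+); (-?\d{1,3}\.\d+)\ufeff?.+"

# The whole input is matched and replaced by the two captured numbers.
result = re.sub(pattern, r"(\g<1>, \g<2>)", text)
print(result)  # (44.35, -68.21)
```

The result is still a string; pass the two groups through float() if a numeric tuple is needed.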
