0

I am parsing a JSON document in Python and I have gotten nearly the whole process to work except I am having trouble converting a GPS string into the correct form.

I have the following form:

"gsx$gps":{"$t":"44°21′N 68°13′W\ufeff / \ufeff44.35°N 68.21°W\ufeff / 44.35; -68.21\ufeff (Acadia)"}

and that is from this HTML form:

44°21′N 68°13′W / 44.35°N 68.21°W / 44.35; -68.21 (Acadia)

and I want the final product to be a string that looks like this:

(44.35, -68.21)

here are a few other example JSON strings just to give you some more to work with:

"gsx$gps":{"$t":"14°15′S 170°41′W\ufeff / \ufeff14.25°S 170.68°W\ufeff / -14.25; -170.68\ufeff (American Samoa)"}

"gsx$gps":{"$t":"38°41′N 109°34′W\ufeff / \ufeff38.68°N 109.57°W\ufeff / 38.68; -109.57\ufeff (Arches)"}

I have the following:

GPSlocation = entry['gsx$gps']['$t']

and then I don't know how to get GPSlocation into the form that I want above.

This isn't parsing JSON ... he already has a dict built from the JSON ... it's just parsing the format above into a tuple. Commented Oct 2, 2012 at 4:37

4 Answers

1

not super elegant but it works...also you are not parsing json ... just parsing a string...

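# Take the middle "/"-separated part, e.g. u"44.35°N 68.21°W", without its \ufeff padding
center_part = GPSlocation.split("/")[1].strip(u" \ufeff")
N, W = center_part.split()
# Keep only the number in front of the degree sign (u"\xb0")
N, W = N.split(u"\xb0")[0], W.split(u"\xb0")[0]
tpl = (N, W)
print(tpl)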

on a side note these are not ints ...
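
As the side note says, the values above are strings, and the hemisphere letters are dropped, so a western longitude would come out positive. A minimal sketch of one way to get signed floats from the middle part follows. `signed_degrees` and `middle_part_to_floats` are hypothetical helper names, and the code assumes the field always has the three `/`-separated parts shown in the question.

```python
def signed_degrees(token):
    """Convert a token such as u'68.21°W' to a signed float (-68.21)."""
    token = token.strip(u" \ufeff")           # drop padding and U+FEFF markers
    value, hemisphere = token.split(u"\xb0")  # u"\xb0" is the degree sign
    sign = -1.0 if hemisphere in (u"S", u"W") else 1.0
    return sign * float(value)


def middle_part_to_floats(gps_location):
    """Assumes the 'a / b / c (Name)' layout shown in the question."""
    middle = gps_location.split(u"/")[1].strip(u" \ufeff")
    lat_token, lon_token = middle.split()
    return signed_degrees(lat_token), signed_degrees(lon_token)


sample = u"44°21′N 68°13′W\ufeff / \ufeff44.35°N 68.21°W\ufeff / 44.35; -68.21\ufeff (Acadia)"
print(middle_part_to_floats(sample))
```

South and west map to negative values here, which matches the decimal pair at the end of each sample string.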

1 Comment

alright great. yes I am just parsing a string. this gives me what I need but what exactly is the \xb0 signifying?
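
On the `\xb0` question: it is the Python escape for the character U+00B0, the degree sign, and `\ufeff` is an invisible zero-width character that pads the fields. A small sketch using the standard `unicodedata` module to check this (the `names` dictionary is only for illustration):

```python
import unicodedata

# Look up the official Unicode names of the escapes seen in the GPS string.
names = {
    u"\xb0": unicodedata.name(u"\xb0"),      # the degree sign
    u"\u2032": unicodedata.name(u"\u2032"),  # the minute mark after 21 and 13
    u"\ufeff": unicodedata.name(u"\ufeff"),  # the invisible padding character
}
for char, name in names.items():
    print(repr(char), name)
```
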
1

Here we go:

import json
jstr = """{"gsx$gps":{"$t":"14°15′S 170°41′W\ufeff / \ufeff14.25°S 170.68°W\ufeff / -14.25; -170.68\ufeff (American Samoa)"}}"""
a = json.loads(jstr)
tuple(float(x) for x in a['gsx$gps']['$t'].split('/')[-1].split(u'\ufeff')[0].split(';'))

Gives:

(-14.25, -170.68)

Or from the plain string:

GPSlocation = u"14°15′S 170°41′W\ufeff / \ufeff14.25°S 170.68°W\ufeff / -14.25; -170.68\ufeff (American Samoa)"
tuple(float(x) for x in GPSlocation.split('/')[-1].split(u'\ufeff')[0].split(';'))
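
The question asked for a string like `(44.35, -68.21)` rather than a tuple. A rough sketch that wraps the same split in a function and formats the result; `gps_to_string` is a made-up name, and it assumes the decimal pair is always the last `/`-separated part and is terminated by `\ufeff`:

```python
def gps_to_string(gps_location):
    """Return e.g. '(44.35, -68.21)' from the raw gsx$gps text."""
    # The last '/'-separated part looks like u' 44.35; -68.21\ufeff (Acadia)'.
    decimal_part = gps_location.split(u"/")[-1].split(u"\ufeff")[0]
    lat, lon = (float(x) for x in decimal_part.split(u";"))
    return u"({0}, {1})".format(lat, lon)


examples = [
    u"44°21′N 68°13′W\ufeff / \ufeff44.35°N 68.21°W\ufeff / 44.35; -68.21\ufeff (Acadia)",
    u"38°41′N 109°34′W\ufeff / \ufeff38.68°N 109.57°W\ufeff / 38.68; -109.57\ufeff (Arches)",
]
for text in examples:
    print(gps_to_string(text))
```

Going through `float` first means a malformed field raises `ValueError` instead of silently producing a bad string.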

Some timeit numbers, showing why it pays to avoid the fancier regexp ;)

import re
import timeit
setup='GPSlocation = u"14°15′S 170°41′W\ufeff / \ufeff14.25°S 170.68°W\ufeff / -14.25; -170.68\ufeff (American Samoa)"; import re'
print timeit.timeit("map(float, GPSlocation.split('/')[-1].split(u'\ufeff')[0].split(';'))", setup=setup)
print timeit.timeit("map(float, re.findall(r'(-?\d+(?:\.\d+)?)', GPSlocation)[-2:])", setup=setup)

5.89355301857
22.6919388771

4 Comments

with GPSlocation all I have is this string: "14°15′S 170°41′W\ufeff / \ufeff14.25°S 170.68°W\ufeff / -14.25; -170.68\ufeff (American Samoa)" but I suppose if I go back a step this works
Just ignore the first two lines and replace a['gsx$gps']['$t'] with GPSlocation.
the only problem I am having which I was having originally is that it isn't doing anything about the degree symbol and it cannot encode that
Well, you have to enable unicode. Also remember the little u'asdf' in front of the strings. How you import the unicode string correctly depends on your data source. json.loads creates unicode automatically for example.
0

You can extract the data with regex:

>>> import re
>>> text = '''"gsx$gps":{"$t":"44°21′N 68°13′W\ufeff / \ufeff44.35°N 68.21°W\ufeff / 44.35; -68.21\ufeff (Acadia)"}'''
>>> map(float, re.findall(r'(-?\d+(?:\.\d+)?)', text)[-2:])
[44.35, -68.21]
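
Taking the last two numbers works for these samples, but it would pick the wrong values if the name in parentheses ever contained digits. A slightly stricter sketch, assuming the decimal pair always appears as `lat; lon`; the pattern and the name `DECIMAL_PAIR` are assumptions for illustration, not part of the answer above:

```python
import re

# Match the 'lat; lon' decimal pair, e.g. '44.35; -68.21'.
DECIMAL_PAIR = re.compile(r"(-?\d+(?:\.\d+)?)\s*;\s*(-?\d+(?:\.\d+)?)")

text = u"44°21′N 68°13′W\ufeff / \ufeff44.35°N 68.21°W\ufeff / 44.35; -68.21\ufeff (Acadia)"
match = DECIMAL_PAIR.search(text)
if match:
    coords = tuple(float(g) for g in match.groups())
    print(coords)
```
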

0
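import re
# The input is a raw string, so \ufeff stays a literal backslash sequence for the trailing \\ to match
re.sub(r'.+/ (-?\d{1,3}\.\d\d); (-?\d{1,3}\.\d\d)\\.+',
       r"(\g<1>, \g<2>)",
       r"44°21′N 68°13′W\ufeff / \ufeff44.35°N 68.21°W\ufeff / 44.35; -68.21\ufeff (Acadia)")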

1 Comment

This seems to have some issues when you feed in unicode strings. Besides that, I don't think the idea was to output the values as a string, but to get a tuple that you can actually work with.
