
I'm working on a personal project that visualizes location data. As part of it, I reverse geocode coordinates with the Google Geocoding API: I send it a latitude/longitude pair and read the city name and country from the response.

The data is a CSV file with two columns: "Location" (latitude and longitude) and "Time" (date and time). There are 8533 rows.

Sample Data:

--------------------------------------------------
| Location             | Time                    |
--------------------------------------------------
| 41.2911084,2.0779035 | 4/15/2015 10:58         |
--------------------------------------------------
| 41.2885014,2.0725591 | 4/15/2015 10:07         |
--------------------------------------------------
| 41.3484125,2.1442487 | 4/15/2015 9:56          |
--------------------------------------------------

I keep getting an error when I call the API. Here is the code first:

# import necessary modules
import pandas as pd
import json, requests, logging

# configure logging for our tool
lfh = logging.FileHandler('reverseGeocoder.log')
lfh.setFormatter(logging.Formatter('%(levelname)s %(asctime)s %(message)s'))
log = logging.getLogger('reverseGeocoder')
log.setLevel(logging.INFO)
log.addHandler(lfh)

# load the gps coordinate data
df = pd.read_csv('LocationHistory.csv')

# create new columns
df['geocode_data'] = ''
df['city'] = ''
df['country'] = ''


df.head()

# function that handles the geocoding requests
def reverseGeocode(latlng):

    result = {}
    url = 'https://maps.googleapis.com/maps/api/geocode/json?latlng={0}&key={1}'
    apikey = 'API_KEY_GOES_HERE'

    request = url.format(latlng, apikey)
    log.info(request)
    data = json.loads(requests.get(request).text)
    log.info(data)
    result = data['results'][0]['address_components']
    return {
        'city': result[3]['long_name'],
        'country': result[6]['long_name']
    }

# comment out the following line of code to geocode the entire dataframe
#df = df.head()

for i, row in df.iterrows():
    # for each row in the dataframe, geocode the lat-long data
    revGeocode = reverseGeocode(df['Location'][i])
    df['geocode_data'][i] = revGeocode
    df['city'] = revGeocode['city']
    df['country'] = revGeocode['country']


    # once every 100 loops print a counter
    #if i % 100 == 0: 
    print i

df.head()

df.to_csv('LocationHistory2.csv', encoding='utf-8', index=False)

The error in question that I keep receiving:

Traceback (most recent call last):
  File "D:\...\ReverseGeocoding.py", line 45, in <module>
    revGeocode = reverseGeocode(df['Location'][i])
  File "D:\...\ReverseGeocoding.py", line 37, in reverseGeocode
    'country': result[6]['long_name']
IndexError: list index out of range

I think that part of the problem is that I need a check in place, in case the API doesn't return anything for the locations. Why it wouldn't return anything, I have no idea.

I'm quite new to the world of APIs (and Python), but how could I get this code to a running state?

2 Answers


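You probably want to check the types key of each address component to find the one you want, rather than relying on fixed positions in the list: the number and order of components varies between results, which is why result[6] can raise an IndexError. Try something like: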

    result = data['results'][0]['address_components']
    city = ''
    country = ''

    # pick components by their declared types, not by list position
    for item in result:
        # 'types' must be a quoted string key
        if 'administrative_area_level_1' in item['types']:
            city = item['long_name']
        elif 'country' in item['types']:
            country = item['long_name']
    return {
        'city': city,
        'country': country
    }
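
As a variation on the lookup above, here is a minimal sketch that tries the 'locality' type first (the type the Geocoding API documents for a city or town) and only falls back to 'administrative_area_level_1', which denotes a first-order region such as a state. The helper names find_component and extract_city_country are hypothetical, and the sample list is hand-written for illustration, not real API output.

```python
def find_component(address_components, wanted_type):
    """Return the long_name of the first component tagged with wanted_type, or ''."""
    for component in address_components:
        if wanted_type in component.get('types', []):
            return component.get('long_name', '')
    return ''


def extract_city_country(address_components):
    """Pick city and country by component type instead of list position."""
    city = ''
    # Assumption: prefer 'locality', fall back to the first-order region.
    for candidate in ('locality', 'administrative_area_level_1'):
        city = find_component(address_components, candidate)
        if city:
            break
    country = find_component(address_components, 'country')
    return {'city': city, 'country': country}


# Hand-written sample shaped like an address_components list (illustrative only).
sample = [
    {'long_name': 'Example Town', 'short_name': 'Example Town',
     'types': ['locality', 'political']},
    {'long_name': 'Example Region', 'short_name': 'ER',
     'types': ['administrative_area_level_1', 'political']},
    {'long_name': 'Exampleland', 'short_name': 'EX',
     'types': ['country', 'political']},
]

print(extract_city_country(sample))
```

Because missing components simply yield an empty string, a result with only a country (or with nothing at all) no longer raises an IndexError.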

1 Comment

This worked for me. Note that the key has to be a quoted string: item['types'], not item[types].

"I think that part of the problem is that I need a check in place, in case the API doesn't return anything for the locations."

Indeed. The first thing you want to do is to put your requests call in a try/except block to catch possible exceptions during the request phase (and there are quite a few things that can go wrong when doing an HTTP request).

BTW you don't have to build the querystring manually - requests takes care of it in a safer way (escaping etc), and you'll still have access to the resulting url in the response object if you want it. So as a starter you want:

url = 'https://maps.googleapis.com/maps/api/geocode/json'
apikey = 'API_KEY_GOES_HERE'
try:
    # requests builds and escapes the query string from params
    response = requests.get(url, params={"key": apikey, "latlng": latlng})
except requests.exceptions.RequestException as e:
    # this will log the whole traceback
    # (`log` is the logger configured in the question's script)
    log.exception("call failed with %s", e)
    # here you either re-raise the exception, raise your own exception,
    # or return anything
    return None

Now you also want to check the response's status code. Anything other than 200 means you don't have your data:

if response.status_code != 200:
    # `log` is the logger configured in the question's script
    log.error("got status code %s", response.status_code)
    # idem, either raise your own exception or
    # return anything
    return None

FWIW, response has a raise_for_status() method that raises an HTTPError (a subclass of RequestException) if you get a 4XX or 5XX response, so you can simplify the whole thing to:

try:
    response = requests.get(url, params={"key": apikey, "latlng": latlng})
    # raises for 4XX/5XX, so HTTP errors land in the same except clause
    response.raise_for_status()
except requests.exceptions.RequestException as e:
    # this will log the whole traceback
    # (`log` is the logger configured in the question's script)
    log.exception("call failed with %s", e)
    # here you either re-raise the exception, raise your own exception,
    # or return anything
    return None

Now you can expect to have a valid response, so let's get our JSON data. Here again, requests already provides a shortcut. Note that if the response's content is not valid JSON, you'll get a ValueError, but well, I think we can trust Google to do the job here ;)

data = response.json()

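I don't remember the whole Geocoding API exactly, so you should really double-check the docs, but IIRC as long as you get a 200, you should have some valid data. Keep in mind that the JSON body carries its own status field (for example ZERO_RESULTS or OVER_QUERY_LIMIT), so a 200 alone does not guarantee a non-empty results list.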
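
To guard against a 200 response that still carries no usable data, a minimal sketch could inspect the payload before indexing into it. The function name parse_geocode_payload is hypothetical, and the sample dictionaries are hand-written to mimic the documented shape (a top-level status string and a results list); check the Geocoding API docs for the full list of status values.

```python
import logging

log = logging.getLogger('reverseGeocoder')


def parse_geocode_payload(data):
    """Return the first result's address_components, or None if unusable.

    `data` is the dict obtained from response.json().
    """
    status = data.get('status')
    results = data.get('results') or []
    if status != 'OK' or not results:
        # e.g. ZERO_RESULTS for coordinates with no address,
        # or OVER_QUERY_LIMIT when the quota is exhausted
        log.warning("no usable geocoding result, status=%s", status)
        return None
    return results[0].get('address_components', [])


# Hand-written payloads for illustration, not real API output.
empty = {'status': 'ZERO_RESULTS', 'results': []}
ok = {'status': 'OK',
      'results': [{'address_components': [
          {'long_name': 'Exampleland', 'short_name': 'EX',
           'types': ['country', 'political']}]}]}

print(parse_geocode_payload(empty))
print(len(parse_geocode_payload(ok)))
```

Returning None for the unusable case lets the caller skip or retry that row instead of crashing on data['results'][0].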

"Why it wouldn't return anything, I have no idea."

Connection lost, API limits, server down (yes it happens), there are a lot of possible reasons. With the above code you should at least get a hint.

Now you may still not have all you expect in the resulting data. Here again, check the docs, manually replay the requests for the coordinates that failed, and inspect the response and its data.
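
One way to make the failed lookups easy to replay is to collect them while looping instead of stopping at the first error. This is only a sketch: geocode_all and fake_geocode are hypothetical names, and fake_geocode stands in for a real reverseGeocode() that returns None when it has no data.

```python
import logging

log = logging.getLogger('reverseGeocoder')


def geocode_all(locations, geocode):
    """Call geocode() for each location; return (results, failed)."""
    results = {}
    failed = []
    for location in locations:
        try:
            outcome = geocode(location)
        except (KeyError, IndexError, ValueError) as exc:
            # unexpected response shape: record it and keep going
            log.warning("could not parse response for %s: %r", location, exc)
            outcome = None
        if outcome is None:
            failed.append(location)
        else:
            results[location] = outcome
    return results, failed


def fake_geocode(latlng):
    # Stand-in for a real lookup (assumption for this example only).
    if latlng == '0,0':
        return None
    return {'city': 'Example Town', 'country': 'Exampleland'}


results, failed = geocode_all(['41.2911084,2.0779035', '0,0'], fake_geocode)
print(failed)
```

The failed list can then be fed back through the same function later, or the coordinates pasted into a browser request to inspect the raw response.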

2 Comments

Thanks for the quick reply, I keep getting indentation errors when I try to put the try/except block within the reverseGeocode() beneath the apikey. It should work, but keeps saying unexpected indent. What am I doing wrong?
Edit: Fixed that, but keep getting my previous error.
