2

I have a few questions about JSON in Pandas! I have gotten the data loaded into a dataframe and can search effectively with the query code below. Three questions I have:

  1. How can I get the individual pieces of 'coord' (i.e. 'coord.lon')
  2. How do I assign specific columns to variables in the code? (i.e. state = df.loc[df['name'] == city, ['state'])?
  3. How can I add additional search terms? (i.e. df['name'] == city AND df['state'] == state)

Here is the code I have so far to pull the records I need:

query = df.loc[df['name'] == city, ['id', 'name', 'state', 'coord']]

Here is a sample of the JSON response I am trying to search:

{
    "id": 5074329,
    "name": "Oakland",
    "state": "NE",
    "country": "US",
    "coord": {
        "lon": -96.466972,
        "lat": 41.835831
    }
},
{
    "id": 5074472,
    "name": "Omaha",
    "state": "NE",
    "country": "US",
    "coord": {
        "lon": -95.93779,
        "lat": 41.25861
    }
}

As always, THANKS!!

edit: Here is a picture of a couple rows of the dataframe:

url = 'http://bulk.openweathermap.org/sample/city.list.json.gz'

df = pd.read_json(url)

enter image description here

2
  • Can you show the dataframe you are generating from the json file? Commented Jun 2, 2021 at 14:04
  • @JMA - Thanks for getting back to me. I have added a screenshot above. Commented Jun 2, 2021 at 14:19

1 Answer 1

3

If you have the dataframe like this:

        id     name state country                                  coord
0  5074329  Oakland    NE      US  {'lon': -96.466972, 'lat': 41.835831}
1  5074472    Omaha    NE      US    {'lon': -95.93779, 'lat': 41.25861}
  1. To access lon you can use the str method:
df['lon'] = df.coord.str['lon'] 

#output
        id     name state country                                  coord  \
0  5074329  Oakland    NE      US  {'lon': -96.466972, 'lat': 41.835831}   
1  5074472    Omaha    NE      US    {'lon': -95.93779, 'lat': 41.25861}   

         lon  
0 -96.466972  
1 -95.937790 
  1. I guess you wanna do boolean indexing here:
city =  'Omaha'
state = df.loc[df['name'] == city]['state']

The above code will return the pandas series. If you want only the 1st value use iloc:

city =  'Omaha'
state = df.loc[df['name'] == city]['state'].iloc[0]

# output
'NE'
  1. This is quite similar to the 2nd part:
city = 'Omaha'
state = 'NE'
suset_df = df.loc[(df['name'] == city) & (df['state'] == state)]

# OUTPUT
        id   name state country                                coord       lon
1  5074472  Omaha    NE      US  {'lon': -95.93779, 'lat': 41.25861} -95.93779
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.