I'm querying an API and would like to store its results. The JSON response looks something like this:

{
    "name": "John Smith",
    "id": 123456,
    "geo": {
        "region": {
            "city": false
        },
        "country": "United States",
        "ethnicity": "White"
    }
}

The dictionary I'm trying to parse it into looks like this:

dictP = {
        "name": "",
        "id": 0,
        "geo.region.city": "",
        "geo.country": "",
        "geo.ethnicity": ""
}

There's a lot more data: about 120 data points are returned, some nested and some not, so I'm discarding the large part that's useless to me and extracting only what I need. The issue is that sometimes the data is missing, e.g.:

{
    "name": "John Smith",
    "id": 123456,
    "geo": {
        "region": null,
        "country": "United States",
        "ethnicity": "White"
    }
}

and:

{
    "name": "John Smith", 
    "id": 123456, 
    "geo": null
}

or:

{
    "name": "John Smith", 
    "id": null, 
    "geo": null
}

What's the best way to parse this? I have about 75 data points I want to extract, and writing if/else or try/except statements 75 times does not make sense. The rows all need to be uniform because I'm saving to CSV, so ultimately I'd like to fill in "None" for missing data. I can't seem to find a library that does this. Advice appreciated.

5 Comments
  • How do you parse it? This may help: stackoverflow.com/questions/64704478/… Commented Feb 6, 2022 at 3:02
  • @Asdoost I'm not sure how to use .get() on nested keys, and I'm also not sure how to pass dotted key paths (e.g. test.something) as the values to look up in a loop. Commented Feb 6, 2022 at 3:13
  • Does this answer your question? How to handle missing keys in list of JSON objects in Python Commented Feb 6, 2022 at 3:13
  • @Asdoost Nope, sadly it doesn't Commented Feb 6, 2022 at 3:14
  • Post a sample of the JSON and the output you want to get. People will help you. Commented Feb 6, 2022 at 3:17

1 Answer

How about FlatDict?

import json
from flatdict import FlatDict

# response_as_text is the raw JSON string returned by the API
response_as_json = json.loads(response_as_text)  # convert into a Python dictionary
result = dict(FlatDict(response_as_json, delimiter='.'))

For the second example response (where "region" is null), this produces a Python dictionary that looks like this:

{'geo.country': 'United States',
 'geo.ethnicity': 'White',
 'geo.region': None,
 'id': 123456,
 'name': 'John Smith'}
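
Note that when "geo" itself is null, the flattened result has no 'geo.country' key at all, so the keys still differ between responses. If only a fixed set of keys is needed and missing ones should become None, one option is to walk each dotted path directly. The following is a minimal sketch using only the standard library, not part of FlatDict; `WANTED`, `get_path` and `extract` are hypothetical names chosen for illustration.

```python
import json

# Hypothetical list of dotted paths to keep; extend it with the other fields.
WANTED = ["name", "id", "geo.region.city", "geo.country", "geo.ethnicity"]


def get_path(data, dotted_path, default=None):
    """Follow a dotted path through nested dicts.

    Returns `default` if any step is missing, is not a dict, or is null.
    """
    current = data
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    # Only null is treated as missing; False and 0 are kept as real values.
    return default if current is None else current


def extract(record, paths=WANTED):
    """Build a flat dict that always contains exactly the keys in `paths`."""
    return {path: get_path(record, path) for path in paths}


response_as_text = '{"name": "John Smith", "id": null, "geo": null}'
row = extract(json.loads(response_as_text))
print(row)
```

Every row produced this way has the same keys in the same order, so the rows can be written to CSV without further checks.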

EDIT:

I forgot the part about CSV. Once you have flat dictionaries, you can just put them into pandas.

>>> d1
{'name': 'John Smith', 'id': 123456, 'geo.region.city': False, 'geo.country': 'United States', 'geo.ethnicity': 'White'}
>>> d2
{'name': 'John Smith', 'id': 123456, 'geo.region': None, 'geo.country': 'United States', 'geo.ethnicity': 'White'}

>>> import pandas as pd
>>> list_of_dict = [d1, d2]  # Put as many as you want in this list.
>>> table = pd.DataFrame(list_of_dict)
>>> table
         name      id geo.region.city    geo.country geo.ethnicity  geo.region
0  John Smith  123456           False  United States         White         NaN
1  John Smith  123456             NaN  United States         White         NaN

>>> table.to_csv(path_to_csv)
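
If the set of columns should be fixed up front and missing values written literally as "None", the standard library's csv.DictWriter is an alternative to pandas. This is a rough sketch under the assumption that each row is already a flat dictionary like d1 and d2 above; `FIELDNAMES` and `write_rows` are hypothetical names.

```python
import csv
import io

# Hypothetical fixed column order for the CSV file.
FIELDNAMES = ["name", "id", "geo.region.city", "geo.country", "geo.ethnicity"]


def write_rows(flat_dicts, file_obj, fieldnames=FIELDNAMES, missing="None"):
    """Write flat dicts as CSV rows with a fixed set of columns."""
    writer = csv.DictWriter(file_obj, fieldnames=fieldnames)
    writer.writeheader()
    for flat in flat_dicts:
        # Absent keys and explicit None values both become the placeholder;
        # keys that are not in fieldnames (e.g. 'geo.region') are dropped.
        row = {key: missing if flat.get(key) is None else flat[key] for key in fieldnames}
        writer.writerow(row)


d1 = {'name': 'John Smith', 'id': 123456, 'geo.region.city': False,
      'geo.country': 'United States', 'geo.ethnicity': 'White'}
d2 = {'name': 'John Smith', 'id': 123456, 'geo.region': None,
      'geo.country': 'United States', 'geo.ethnicity': 'White'}

buffer = io.StringIO()  # stands in for a real file in this example
write_rows([d1, d2], buffer)
print(buffer.getvalue())
```

When writing to a real file instead of a StringIO buffer, open it with newline="" as the csv module documentation recommends.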

3 Comments

Amazing! Love this library already! Any idea how I can idiomatically fill in missing keys? Actually, I think I can use this library and pass the two dicts to a function that loops over the keys and, on an error, just fills in None!
I forgot the CSV part. I think pandas can handle that. I just edited my answer, please check it.
pandas will use the union of the keys across all the dictionaries as columns, but if any key you need is still missing from the output table, you can add it with table['missing_key'] = None.
