I am trying to convert JSON (from AirTable) into a dataframe that I can use for further data transformation.

I ran into an issue after converting the JSON to a dataframe: one of the columns contains a nested list.

This is a sample of the dataframe after flattening, before I realized that "Package" contains a nested list from the original JSON.


|                    | Name                 |Package                                    |
| -------------------| ---------------------|-------------------------------------------|
|rec2mxAycpaC93jfz   | Luis Downes          |[Canceled - Lv1]                           |
|recIQ0HfCmRhUclti   | Milana Whitehouse    |[Canceled - Lv1,2019 - Lv2,2020 - Lv1]     |
|recOFVz0eajFblTzL   | Fatma Mayo           |[Canceled - Lv1,2019 - Lv4,2020 - Lv2]     |

This is a sample of the JSON. "Package" is the field that holds a nested list, and I would like to flatten it out.

[{'id': 'rec2mxAycpaC93jfz',
 'fields': {'Name': 'Luis Downes',
             'Package': ['Canceled - Lv1']},
 'createdTime': '2017-08-25T17:05:45.000Z'},
{'id': 'recIQ0HfCmRhUclti',
 'fields': {'Name': 'Milana Whitehouse',
             'Package': ['Canceled - Lv1', '2019 - Lv2', '2020 - Lv1']},
 'createdTime': '2017-08-25T17:05:46.000Z'},
{'id': 'recOFVz0eajFblTzL',
 'fields': {'Name': 'Fatma Mayo',
            'Package': ['Canceled - Lv1', '2019 - Lv4', '2020 - Lv2']},
 'createdTime': '2017-08-25T17:05:47.000Z'}]

Any idea how to flatten the entire JSON? I have tried a couple of solutions I found, including the one below, but it only flattens the first record into a single row.

# flattening JSON objects of arbitrary structure

def flatten_json(y):
    out = {}

    def flatten(x, name=''):
        if type(x) is dict:
            for a in x:
                flatten(x[a], name + a + '_')
        elif type(x) is list:
            i = 0
            for a in x:
                flatten(a, name + str(i) + '_')
                i += 1
        else:
            out[name[:-1]] = x

    flatten(y)
    return out

The end result (either JSON or a dataframe) I want to achieve is shown below:


|                    | Name                 |Package- Canceled - Lv1 |Package- 2019 - Lv2 |Package- 2020 - Lv1 |Package- 2019 - Lv4 |Package- 2020 - Lv2 |
| -------------------| ---------------------|------------------------|--------------------|--------------------|--------------------|--------------------|
|rec2mxAycpaC93jfz   | Luis Downes          |1                       |0                   |0                   |0                   |0                   |
|recIQ0HfCmRhUclti   | Milana Whitehouse    |1                       |1                   |1                   |0                   |0                   |
|recOFVz0eajFblTzL   | Fatma Mayo           |1                       |0                   |0                   |1                   |1                   |

Thank you in advance for your help here!

1 Answer

Via json_normalize() and get_dummies():

d = [{'id': 'rec2mxAycpaC93jfz',
 'fields': {'Name': 'Luis Downes',
             'Package': ['Canceled - Lv1']},
 'createdTime': '2017-08-25T17:05:45.000Z'},
{'id': 'recIQ0HfCmRhUclti',
 'fields': {'Name': 'Milana Whitehouse',
             'Package': ['Canceled - Lv1', '2019 - Lv2', '2020 - Lv1']},
 'createdTime': '2017-08-25T17:05:46.000Z'},
{'id': 'recOFVz0eajFblTzL',
 'fields': {'Name': 'Fatma Mayo',
            'Package': ['Canceled - Lv1', '2019 - Lv4', '2020 - Lv2']},
 'createdTime': '2017-08-25T17:05:47.000Z'}
]
 
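import pandas as pd

df = pd.json_normalize(d)
# Spread each list across columns, stack to one value per row, one-hot encode, then sum per record.
# groupby(level=0).sum() replaces sum(level=0), which was removed in pandas 2.0.
dm = pd.get_dummies(df['fields.Package'].apply(pd.Series).stack()).groupby(level=0).sum()
pd.concat([df[['id','fields.Name']],dm], axis=1)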

                  id        fields.Name  2019 - Lv2  2019 - Lv4  2020 - Lv1  \
0  rec2mxAycpaC93jfz        Luis Downes           0           0           0   
1  recIQ0HfCmRhUclti  Milana Whitehouse           1           0           1   
2  recOFVz0eajFblTzL         Fatma Mayo           0           1           0   

   2020 - Lv2  Canceled - Lv1  
0           0               1  
1           0               1  
2           1               1  
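
As an alternative sketch (not part of the approach above), the same indicator table can probably be built with explode() and pd.crosstab(), which avoids apply(pd.Series). It assumes pandas 1.0 or later, and the names records, exploded, dummies and result are only illustrative. The "Package- " prefix and the "Name" rename are there to match the desired output in the question.

```python
import pandas as pd

# Sample records copied from the question.
records = [
    {"id": "rec2mxAycpaC93jfz",
     "fields": {"Name": "Luis Downes",
                "Package": ["Canceled - Lv1"]},
     "createdTime": "2017-08-25T17:05:45.000Z"},
    {"id": "recIQ0HfCmRhUclti",
     "fields": {"Name": "Milana Whitehouse",
                "Package": ["Canceled - Lv1", "2019 - Lv2", "2020 - Lv1"]},
     "createdTime": "2017-08-25T17:05:46.000Z"},
    {"id": "recOFVz0eajFblTzL",
     "fields": {"Name": "Fatma Mayo",
                "Package": ["Canceled - Lv1", "2019 - Lv4", "2020 - Lv2"]},
     "createdTime": "2017-08-25T17:05:47.000Z"},
]

# json_normalize flattens the nested dict but leaves the list intact
# in a single 'fields.Package' column.
df = pd.json_normalize(records)

# One row per (id, package) pair.
exploded = df[["id", "fields.Package"]].explode("fields.Package")

# Count occurrences of each package per id; with unique packages this is 0/1.
dummies = pd.crosstab(exploded["id"], exploded["fields.Package"])

# Restore the original record order, give records without any package
# an all-zero row, and prefix the column names.
dummies = dummies.reindex(df["id"].tolist(), fill_value=0).add_prefix("Package- ")

result = (
    df.set_index("id")[["fields.Name"]]
      .rename(columns={"fields.Name": "Name"})
      .join(dummies)
)
print(result)
```

The indicator columns come out in sorted order rather than in order of first appearance, so reorder them with result[[...]] if the exact column layout matters.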