Convert Nested JSON to CSV using Pandas

Question

I am trying to convert nested JSON into CSV using pandas. I have looked at similar questions asked here, but I can't seem to apply them to my scenario. My JSON is the following:

{
  "51% FIFTY ONE PERCENT(PWD)" : {
    "ID" : "51%1574233975114-WEBAD",
    "contactName" : "",
    "createdAt" : 1574233975,
    "debit" : 118268.19999999995,
    "defaultCompany" : "",
    "emailAddress" : "",
    "lastUpdatedAt" : "",
    "phoneNumber" : "",
    "taskNumber" : 0
  },
  "51% STORE (MUZ)" : {
    "ID" : "51%1576650784631-WEBAD",
    "contactName" : "",
    "createdAt" : 1576650784,
    "debit" : 63860,
    "defaultCompany" : "",
    "emailAddress" : "",
    "lastUpdatedAt" : "",
    "phoneNumber" : "",
    "taskNumber" : 0
  },
  "ABBOTT S" : {
    "STORE (ABD)" : {
      "ID" : "ABB1574833257715-WEBAD",
      "contactName" : "",
      "createdAt" : 1574833257,
      "debit" : 35065,
      "defaultCompany" : "",
      "emailAddress" : "",
      "lastUpdatedAt" : "",
      "phoneNumber" : "",
      "taskNumber" : 0
    }
  }
}

This is a snippet of the JSON; as you can see, some entries, but not all, are nested. I tried using json_normalize in the following way:

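import json
import pandas as pd

# Raw string so the backslashes in the Windows path are not treated as escapes.
with open(r'.\Customers\kontrolkotlin-CUSTOMERS-export.json') as f:
    d = json.load(f)

# json_normalize is available as pd.json_normalize in pandas 1.0 and later.
nycphil = pd.json_normalize(data=d)
nycphil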

And got a single-row DataFrame as output. This doesn't work for me, as I want something readable and understandable.
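For context, here is a minimal sketch of why the result has only one row: when json_normalize is given a single dict, it treats it as one record and flattens the nested keys into dotted column names. The sample dict below is a trimmed stand-in for the data above (only two fields kept), so the exact columns are illustrative.

```python
import pandas as pd

# Trimmed sample modeled on the JSON in the question (only two fields kept).
d = {
    "51% STORE (MUZ)": {"ID": "51%1576650784631-WEBAD", "debit": 63860},
    "ABBOTT S": {"STORE (ABD)": {"ID": "ABB1574833257715-WEBAD", "debit": 35065}},
}

# A single dict is treated as one record; nested keys become dotted columns.
flat = pd.json_normalize(d)
print(flat.shape)
print(list(flat.columns))
```

Every customer ends up as a group of columns rather than as a row, which is why the frame is wide and hard to read.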

foglerit · Accepted Answer · 2020-03-20 14:00:50Z

1

I'm sure there's a simpler way, but...

If you assume that the leaves of your nested JSON all have the same fields (ID, contactName, etc.), then you can recursively flatten your JSON and create a list of records, keeping the path that took you to each leaf.

Something like:

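import pandas as pd

def flatten_json(x, path="", result=None):
    # Collect leaf records in one shared list across the recursive calls.
    if result is None:
        result = []
    # A dict with an "ID" key is treated as a leaf record.
    if "ID" in x:
        result.append({**x, "path": path})
        return result
    # Otherwise descend into each child, extending the path.
    for key in x:
        flatten_json(x[key], path + "/" + key, result)
    return result

# `data` is the dict loaded from the JSON file (named `d` in the question).
df = pd.DataFrame(flatten_json(data))
print(df)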

result:

                       ID contactName   createdAt     debit defaultCompany  \
0  51%1574233975114-WEBAD              1574233975  118268.2                  
1  51%1576650784631-WEBAD              1576650784   63860.0                  
2  ABB1574833257715-WEBAD              1574833257   35065.0                  

  emailAddress lastUpdatedAt phoneNumber  taskNumber  \
0                                                  0   
1                                                  0   
2                                                  0   

                          path  
0  /51% FIFTY ONE PERCENT(PWD)  
1             /51% STORE (MUZ)  
2        /ABBOTT S/STORE (ABD)
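Since the goal is a CSV, the flattened records can then be written out with DataFrame.to_csv. The following is a self-contained sketch rather than a definitive implementation: it assumes, as above, that every leaf has an "ID" key, and it uses a trimmed sample dict in place of the real file.

```python
import pandas as pd

def flatten_json(x, path="", result=None):
    """Collect every dict that has an "ID" key, recording the key path to it."""
    if result is None:
        result = []
    if "ID" in x:
        result.append({**x, "path": path})
        return result
    for key in x:
        flatten_json(x[key], path + "/" + key, result)
    return result

# Trimmed sample modeled on the JSON in the question (only two fields kept).
data = {
    "51% STORE (MUZ)": {"ID": "51%1576650784631-WEBAD", "debit": 63860},
    "ABBOTT S": {"STORE (ABD)": {"ID": "ABB1574833257715-WEBAD", "debit": 35065}},
}

df = pd.DataFrame(flatten_json(data))

# With no path argument, to_csv returns the CSV text instead of writing a file.
# index=False leaves out the 0, 1, 2... row labels.
csv_text = df.to_csv(index=False)
print(csv_text)
```

Passing a file path as the first argument to to_csv writes the same content to disk instead of returning it.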

edited Mar 20, 2020 at 14:00

answered Mar 20, 2020 at 10:59

foglerit


2 Comments

Zohair Abbas Hadi Over a year ago

I can't understand the parse_json function you mentioned there

foglerit Over a year ago

Sorry, @ZohairAbbasHadi that was the wrong function name. I fixed the code now.

Nick Bond · Accepted Answer · 2020-03-20 10:03:26Z

0

You can access the data for one entry directly, like:

import pandas as pd

# d is the dict loaded in the question
nycphil = pd.json_normalize(d['51% STORE (MUZ)'])
print(nycphil.head(3))

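Or try something like this:

import pandas as pd

df = pd.read_json('some.json')
# 'some.csv' is a hypothetical output path; with no path, to_csv() only returns the CSV text.
df.to_csv('some.csv')
print(df)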


answered Mar 20, 2020 at 10:03

Nick Bond


1 Comment

Zohair Abbas Hadi Over a year ago

But there is a problem with your answer: the last row shows the nested JSON as it is.
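One possible way around the issue raised in this comment is to walk the nesting first and build the DataFrame only from the leaf records. The sketch below is an assumption-laden illustration: iter_leaves is a hypothetical helper name, a "leaf" is taken to be any dict that contains no nested dicts, and the sample data is trimmed to two fields.

```python
import pandas as pd

def iter_leaves(node, path=()):
    """Yield (path, record) pairs for dicts that contain no nested dicts."""
    if not any(isinstance(value, dict) for value in node.values()):
        yield path, node
        return
    for key, value in node.items():
        # Scalars that sit next to nested dicts are skipped in this sketch.
        if isinstance(value, dict):
            yield from iter_leaves(value, path + (key,))

# Trimmed sample modeled on the JSON in the question (only two fields kept).
d = {
    "51% STORE (MUZ)": {"ID": "51%1576650784631-WEBAD", "debit": 63860},
    "ABBOTT S": {"STORE (ABD)": {"ID": "ABB1574833257715-WEBAD", "debit": 35065}},
}

# One row per leaf, with the chain of parent keys joined into a "name" column.
rows = [{"name": " / ".join(path), **record} for path, record in iter_leaves(d)]
df = pd.DataFrame(rows)
print(df)
```

Unlike a check for a specific key such as "ID", this does not depend on the field names, but it would need adjusting if leaf records could themselves hold nested objects.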
