Extracting specific JSON property into CSV using Python

Question

I have a JSON which is in nested form. I would like to extract billingcode and rate from json and put into csv using pandas python.

{
"TransactionId": "1c9cc4b9-a0e1-4228-b382-b244a7674593",
"Number": "1022",
"Version": 534,
"StartDateTime": "2023-04-01 14:12:50.999",
"EndDateTime": "2023-04-01 14:15:32.038",
"LastUpdatedOn": "2023-04-01",
"RateSchedules": [{
        "BillingCodeType": "CPT",
        "BillingCodeTypeVersion": "2023",
        "BillingCode": "0107T",
        "Rates": [{
                "Rate": 14.11,
                "PlaceOfServices": ["01", "02", "03", "04", "05", "06", "07", "08", "09", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "31", "32", "33", "34", "41", "42", "49", "50", "51", "52", "53", "54", "55", "56", "57", "58", "60", "61", "62", "65", "71", "72", "81", "99"]
            }
        ]
    }, {
        "BillingCodeType": "CPT",
        "BillingCodeTypeVersion": "2023",
        "BillingCode": "0108T",
        "Rates": [{
                "Rate": 14.11,
                "PlaceOfServices": ["01", "02", "03", "04", "05", "06", "07", "08", "09", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "31", "32", "33", "34", "41", "42", "49", "50", "51", "52", "53", "54", "55", "56", "57", "58", "60", "61", "62", "65", "71", "72", "81", "99"]
            }
        ]
    }

}

Above is my JSON and here is my python script. I am running it from Spyder (Python 3.11) from Anaconda that has Pandas

import pandas as pd
df = pd.read_json(r"C:\Users\Jay.Ramachandra\Documents\20230401- 
141250_1022_RI_MC01.json")
df = df.loc[["BillingCode", ["Rate"], "Rates"],"RateSchedules"].T
df.to_csv(r"C:\Users\Jay.Ramachandra\Documents\MrfJson.csv")

I am referencing this post here to write my python Convert Json to CSV using Python

I ran into this and this is now corrected using r"

runfile('C:/Users/.spyder-py3/temp.py', wdir='C:/Users/.spyder-py3')
  File <unknown>:9
    df = pd.read_json("C:\Users\MC01.json")
                                                                                            ^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape

What am i doing wrong? how to correct this?

Addendum

I edited my original question to add r" in my code and now run into

TypeError: unhashable type: 'list'

Try adding r in front of the path df = pd.read_json(r"C:\Users\MC01.json") and df.to_csv(r"C:\Users\MC01Json.csv") — Daviid
– Daviid, Commented Oct 3, 2023 at 14:06
@Daviid - I tried it and also \\ but now run into TypeError: unhashable type: 'list' — user176047
– user176047, Commented Oct 3, 2023 at 14:24

lakshyaag · Accepted Answer · 2023-10-03 14:56:01Z

You can use the in-built json_normalize function to achieve this. The .json you have pasted is not formatted correctly as well.

Assuming the JSON file is formatted correctly and saved as raw.json, the following code block should give you a Pandas DataFrame

import json
with open('raw.json', 'r') as f:
    d = json.load(f)

# Read the RateSchedules part of the JSON file
d = pd.json_normalize(d['RateSchedules'])

# Extract the first item in the list of Rates
d['Rates'] = d['Rates'].apply(lambda x: x[0]['Rate'])

# Save as CSV
d.to_csv("./MrfJson.csv", index=False)

Alternatively, as JonSG correctly pointed out, you can extend it using record_path and meta:

import json
with open('raw.json', 'r') as f:
    d = json.load(f)

# Read the RateSchedules part of the JSON file
d = pd.json_normalize(
    d['RateSchedules'],
    record_path=["RateSchedules", "Rates"],
    meta=[["RateSchedules", "BillingCode"]]
)[["RateSchedules.BillingCode", "Rate"]]

# Save as CSV
d.to_csv("./MrfJson.csv", index=False)

user176047 · Accepted Answer · 2023-10-03 19:00:08Z

0

import json
import pandas as pd
dr = 'C:\\Users\\Documents\\'
dr += 'MC01.json'
with open(dr, 'r') as f:
    jsn = json.load(f)
print(len(jsn['RateSchedules']))

 

df = pd.json_normalize(jsn['RateSchedules'])
rateList = []
for i in df.index:
    for z in df['Rates'][i]:
        rateList.append((jsn['RateSchedules'][i]['BillingCode'],z['Rate']))
pd.DataFrame(rateList, columns=['BillingCode','Rate']).to_csv('C:\\Users\\MrfJson.csv',index=False)
print(len(rateList))

answered Oct 3, 2023 at 19:00

user176047

3814 silver badges24 bronze badges

Collectives™ on Stack Overflow

Extracting specific JSON property into CSV using Python

2 Answers 2

Comments

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related