1

I have a JSON which is in nested form. I would like to extract billingcode and rate from json and put into csv using pandas python.

{
"TransactionId": "1c9cc4b9-a0e1-4228-b382-b244a7674593",
"Number": "1022",
"Version": 534,
"StartDateTime": "2023-04-01 14:12:50.999",
"EndDateTime": "2023-04-01 14:15:32.038",
"LastUpdatedOn": "2023-04-01",
"RateSchedules": [{
        "BillingCodeType": "CPT",
        "BillingCodeTypeVersion": "2023",
        "BillingCode": "0107T",
        "Rates": [{
                "Rate": 14.11,
                "PlaceOfServices": ["01", "02", "03", "04", "05", "06", "07", "08", "09", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "31", "32", "33", "34", "41", "42", "49", "50", "51", "52", "53", "54", "55", "56", "57", "58", "60", "61", "62", "65", "71", "72", "81", "99"]
            }
        ]
    }, {
        "BillingCodeType": "CPT",
        "BillingCodeTypeVersion": "2023",
        "BillingCode": "0108T",
        "Rates": [{
                "Rate": 14.11,
                "PlaceOfServices": ["01", "02", "03", "04", "05", "06", "07", "08", "09", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "31", "32", "33", "34", "41", "42", "49", "50", "51", "52", "53", "54", "55", "56", "57", "58", "60", "61", "62", "65", "71", "72", "81", "99"]
            }
        ]
    }

}

Above is my JSON and here is my python script. I am running it from Spyder (Python 3.11) from Anaconda that has Pandas

import pandas as pd
df = pd.read_json(r"C:\Users\Jay.Ramachandra\Documents\20230401- 
141250_1022_RI_MC01.json")
df = df.loc[["BillingCode", ["Rate"], "Rates"],"RateSchedules"].T
df.to_csv(r"C:\Users\Jay.Ramachandra\Documents\MrfJson.csv")

I am referencing this post here to write my python Convert Json to CSV using Python

I ran into this and this is now corrected using r"

runfile('C:/Users/.spyder-py3/temp.py', wdir='C:/Users/.spyder-py3')
  File <unknown>:9
    df = pd.read_json("C:\Users\MC01.json")
                                                                                            ^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape

What am i doing wrong? how to correct this?

Addendum

I edited my original question to add r" in my code and now run into

TypeError: unhashable type: 'list'
3
  • Try adding r in front of the path df = pd.read_json(r"C:\Users\MC01.json") and df.to_csv(r"C:\Users\MC01Json.csv") Commented Oct 3, 2023 at 14:06
  • stackoverflow.com/questions/2953834/… Commented Oct 3, 2023 at 14:06
  • @Daviid - I tried it and also \\ but now run into TypeError: unhashable type: 'list' Commented Oct 3, 2023 at 14:24

2 Answers 2

2

You can use the in-built json_normalize function to achieve this. The .json you have pasted is not formatted correctly as well.

Assuming the JSON file is formatted correctly and saved as raw.json, the following code block should give you a Pandas DataFrame

import json
with open('raw.json', 'r') as f:
    d = json.load(f)

# Read the RateSchedules part of the JSON file
d = pd.json_normalize(d['RateSchedules'])

# Extract the first item in the list of Rates
d['Rates'] = d['Rates'].apply(lambda x: x[0]['Rate'])

# Save as CSV
d.to_csv("./MrfJson.csv", index=False)

Alternatively, as JonSG correctly pointed out, you can extend it using record_path and meta:

import json
with open('raw.json', 'r') as f:
    d = json.load(f)

# Read the RateSchedules part of the JSON file
d = pd.json_normalize(
    d['RateSchedules'],
    record_path=["RateSchedules", "Rates"],
    meta=[["RateSchedules", "BillingCode"]]
)[["RateSchedules.BillingCode", "Rate"]]

# Save as CSV
d.to_csv("./MrfJson.csv", index=False)
Sign up to request clarification or add additional context in comments.

Comments

0
import json
import pandas as pd
dr = 'C:\\Users\\Documents\\'
dr += 'MC01.json'
with open(dr, 'r') as f:
    jsn = json.load(f)
print(len(jsn['RateSchedules']))

 

df = pd.json_normalize(jsn['RateSchedules'])
rateList = []
for i in df.index:
    for z in df['Rates'][i]:
        rateList.append((jsn['RateSchedules'][i]['BillingCode'],z['Rate']))
pd.DataFrame(rateList, columns=['BillingCode','Rate']).to_csv('C:\\Users\\MrfJson.csv',index=False)
print(len(rateList))

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.