
I'm trying to store API output in a CSV file or database, and I cannot figure out how to turn the keys in "tierList" into columns. In my case one row should correspond to one bin, and each key should be a column in my output. Is it possible to do this with pd.json_normalize? Please direct me to the right library or tool. Thanks to all.

Please refer to the compact test Python script below. I don't understand why I can only use record_path='memberList'; anything else gives an error. In theory, I should be able to use record_path='memberRiskData' and add the rest of the columns with meta.

import json
import os
import pandas as pd

json_file = '''
{  "content":"BIN REST",   "riskMonth":"20250401",   "pagination":{      "currentPage":1,      "totalPages":26   },
   "memberList":[
      {  "bin":"22222","firstName":"MARIA", "lastName":"PLACARD",
         "memberRiskData":{
              "strata":"East",  "postParameter":"",  
              "tierList":[
               { "riskTier":"AdverseSubdomainTier",
                 "tierValue":"High"               },
               {  "riskTier":"SocialDomainTier",
                  "tierValue":"Med"               }            ]         }      }   ]} '''

data = json.loads(json_file)
print('.......type =',type(data))
print(data.items())
print(data['memberList'][0])
df = pd.json_normalize(data, record_path='memberList') # , meta=['strata','content']) TBD....

print (df) 
df.to_csv('c:/out.csv', index=False)

My current output is below. Somehow I need to split the column memberRiskData.tierList into several columns, one for each key. (Image of the current output not shown.)

And this is my desired output. (Image of the desired output not shown.)

  • This is less of a Python problem than a logic problem. Describe to yourself, in English (or whatever your preferred language is), how you want the data to be represented. Then describe what needs to change from what you have. After that, writing the code should be relatively simple. Commented Nov 15 at 19:08
  • Do you think it can be done with json_normalize? Commented Nov 15 at 19:41
  • I don't understand why the comment above was so liked; my question is perfect. Commented yesterday

1 Answer


The challenge you have here is that the JSON contains three nested levels of structure, and you want to handle each one differently.

  • First level (list): "memberList" should become the rows of your CSV

  • Second level (dict): "memberRiskData" should (for the non-list values of the dict) become columns named based on the keys in the dict

  • Third level (list): "tierList" should become columns named after each entry's "riskTier" value, with the cell value taken from that entry's "tierValue".

There isn't a function that will do that for you all in one step, so pandas is likely not going to help you much more than just writing the CSV.
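That said, if you would rather stay in pandas, a two-step route may work: json_normalize with a nested record_path to get one row per tier, then a pivot to spread the tiers into columns. The following is only a sketch on a trimmed copy of the sample payload. It assumes every member has a non-empty "tierList" and that no "riskTier" repeats within a member (pivot raises on duplicates); the variable names long_df, wide_df and id_cols are made up for illustration.

```python
import pandas as pd

# Trimmed copy of the question's sample payload.
data = {
    "content": "BIN REST",
    "memberList": [
        {"bin": "22222", "firstName": "MARIA", "lastName": "PLACARD",
         "memberRiskData": {
             "strata": "East", "postParameter": "",
             "tierList": [
                 {"riskTier": "AdverseSubdomainTier", "tierValue": "High"},
                 {"riskTier": "SocialDomainTier", "tierValue": "Med"},
             ]}},
    ],
}

# Step 1: one row per tierList entry; member-level fields are repeated via meta.
# record_path must lead to a list, which is why 'memberRiskData' (a dict) alone fails.
long_df = pd.json_normalize(
    data["memberList"],
    record_path=["memberRiskData", "tierList"],
    meta=["bin", "firstName", "lastName", ["memberRiskData", "strata"]],
)

# Step 2: spread each riskTier into its own column, giving one row per member.
id_cols = ["bin", "firstName", "lastName", "memberRiskData.strata"]
wide_df = (
    long_df.pivot(index=id_cols, columns="riskTier", values="tierValue")
    .reset_index()
)
wide_df.columns.name = None  # drop the leftover 'riskTier' axis label
print(wide_df)
```

This is still two operations rather than one, and it silently drops members whose "tierList" is empty, so the plain Python approach is arguably easier to reason about.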

Here's how I would do it, using native Python for the manipulation:

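# Note: the dict merge operator | requires Python 3.9+.

def processTierList(o):
    # Turn [{'riskTier': name, 'tierValue': value}, ...] into {name: value, ...}
    return {i['riskTier']: i['tierValue'] for i in o}

def processRiskData(o):
    # Keep the scalar fields and replace 'tierList' with one key per tier
    return {k: v for k, v in o.items() if k != 'tierList'} | processTierList(o['tierList'])

def processMember(o):
    # Keep the member's own fields and merge in the flattened risk data
    return {k: v for k, v in o.items() if k != 'memberRiskData'} | processRiskData(o['memberRiskData'])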

Then processMember will handle each row to produce a flat dict, and the resulting list can then be written to a CSV either with the standard library module or with pandas.

json_file = ''' ... '''
data = json.loads(json_file)
output = [processMember(member) for member in data['memberList']]
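As a rough sketch of the final step, the list of flat dicts can be written with the standard library's csv.DictWriter. The row below is hard-coded to match what processMember returns for the sample member, and an in-memory buffer stands in for a real file; the names fieldnames, buffer and csv_text are illustrative only.

```python
import csv
import io

# Flat rows as produced by processMember (values copied from the question's sample).
output = [
    {"bin": "22222", "firstName": "MARIA", "lastName": "PLACARD",
     "strata": "East", "postParameter": "",
     "AdverseSubdomainTier": "High", "SocialDomainTier": "Med"},
]

# Collect every key seen across all rows, in first-seen order, so that
# members with different tiers still share a single header.
fieldnames = list(dict.fromkeys(key for row in output for key in row))

buffer = io.StringIO()  # stand-in for open('out.csv', 'w', newline='')
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
writer.writeheader()
writer.writerows(output)  # rows missing a tier get an empty cell via restval

csv_text = buffer.getvalue()
print(csv_text)
```

With pandas, pd.DataFrame(output).to_csv(path, index=False) should give an equivalent file, since DataFrame also builds its columns from the union of the dict keys.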

2 Comments

Thanks so much Mr. Richard !! Problem solved.
Amazing, thanks again Richard. I will also try to append the top-level (level 0) content to each row.
