
I have a JSON file made up of arrays nested inside arrays. I can get all the "parts" with the code below, but I can't figure out how to use the json_normalize parameters to extract values from different levels of the nested arrays.

That is, I want the 'id' from the vehicle array, together with the 'id' from the model array, for every entry in the parts array, e.g.:

car | camry | "value":"engine","charge":10.82

Thanks

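import json
import pandas as pd

# Load the nested JSON document
with open('sample.json') as f:
    data = json.load(f)

# One row per model; 'parts' holds a list of dicts (NaN when a model has no parts)
df1 = pd.json_normalize(data['vehicle'], 'model')
ddf = pd.DataFrame(columns=['value', 'charge'])

for index, parts in df1['parts'].items():
    # Only the first part of each model is read here
    if isinstance(parts, list) and parts:
        ddf.loc[index] = [parts[0]['value'], parts[0]['charge']]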


{
  "vehicle": [
    {
      "id": "car",
      "model": [
        {
          "id": "camry",
          "parts": [
            { "value": "engine", "charge": 10.82 }
          ]
        },
        {
          "id": "avelon",
          "parts": [
            { "value": "seats", "charge": 538.26 }
          ]
        },
        {
          "id": "prius",
          "parts": [
            { "value": "seats", "charge": 10.91 }
          ]
        },
        {
          "id": "corolla",
          "markup": { "value": "61" },
          "accessories": [
            { "value": "vvvvv" }
          ]
        }
      ]
    }
  ]
}

1 Answer


I think you need:

#remove NaNs
s = df1['parts'].dropna()
#create new DataFrame, assuming only one list always
df2 = pd.DataFrame(s.str[0].values.tolist(), index=s.index)
print (df2)
   charge   value
0   10.82  engine
1  538.26   seats
2   10.91   seats

#join to original
df = df1[['id']].join(df2)
print (df)
        id  charge   value
0    camry   10.82  engine
1   avelon  538.26   seats
2    prius   10.91   seats
3  corolla     NaN     NaN
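
On the json_normalize parameters the question asks about: the following is only a sketch, assuming pandas 1.0 or later (where the function is available as pd.json_normalize) and the sample data from the question inlined as a dict. record_path walks down to the parts lists and meta pulls the 'id' from each level above. Note that record_path expects the key on every record, so the sketch first gives models without "parts" an empty list; those models (corolla here) then produce no rows. The column names vehicle_id and model_id are my own choice.

```python
import pandas as pd

# Sample data from the question, inlined so the sketch is self-contained
data = {
    "vehicle": [
        {
            "id": "car",
            "model": [
                {"id": "camry", "parts": [{"value": "engine", "charge": 10.82}]},
                {"id": "avelon", "parts": [{"value": "seats", "charge": 538.26}]},
                {"id": "prius", "parts": [{"value": "seats", "charge": 10.91}]},
                {
                    "id": "corolla",
                    "markup": {"value": "61"},
                    "accessories": [{"value": "vvvvv"}],
                },
            ],
        }
    ]
}

# record_path needs "parts" on every model, so fill in an empty list where it is missing
for vehicle in data["vehicle"]:
    for model in vehicle["model"]:
        model.setdefault("parts", [])

flat = pd.json_normalize(
    data["vehicle"],
    record_path=["model", "parts"],   # one output row per part
    meta=["id", ["model", "id"]],     # vehicle id and model id for each part
)

# Rename the meta columns (names chosen here for readability)
flat = flat.rename(columns={"id": "vehicle_id", "model.id": "model_id"})
print(flat)
```

Because each part becomes its own row, this layout should also cope with models that list more than one part.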

6 Comments

Thanks. A follow-up: how would you handle a "parts" array that has multiple parts, e.g. camry has both engine and seats?
Do you mean something like [ { "value":"engine", "charge":10.82 } ,{ "value":"engine1", "charge":9.43} ] ?
It depends on what you need. Another pair of columns? But then the other rows get NaNs if the lists of dicts have different lengths. Or do you need something else?
Yes, I tried with [ { "value":"engine", "charge":10.82 } ,{ "value":"engine1", "charge":9.43} ] and only one part is returned.
Yes, I mentioned that in the solution. So what do you need? New columns? What is the desired output for [ { "value":"engine", "charge":10.82 } ,{ "value":"engine1", "charge":9.43} ] ?
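
For the multiple-parts case raised in these comments, one possible layout is one row per part rather than extra columns. This is a minimal sketch in plain Python, not the answer's method; the helper name flatten_parts and the output keys vehicle and model are hypothetical, and the sample dict is the two-part camry example from the comments plus a model without parts.

```python
def flatten_parts(data):
    """Return one dict per part, carrying the vehicle id and model id along."""
    rows = []
    for vehicle in data.get("vehicle", []):
        for model in vehicle.get("model", []):
            # Models without a "parts" key simply contribute no rows
            for part in model.get("parts", []):
                rows.append(
                    {
                        "vehicle": vehicle["id"],
                        "model": model["id"],
                        "value": part["value"],
                        "charge": part["charge"],
                    }
                )
    return rows


# Assumed sample: camry with two parts, corolla with none
sample = {
    "vehicle": [
        {
            "id": "car",
            "model": [
                {
                    "id": "camry",
                    "parts": [
                        {"value": "engine", "charge": 10.82},
                        {"value": "engine1", "charge": 9.43},
                    ],
                },
                {"id": "corolla", "markup": {"value": "61"}},
            ],
        }
    ]
}

rows = flatten_parts(sample)
for row in rows:
    print(row)
```

If a DataFrame is wanted, passing the resulting list to pd.DataFrame(rows) should give one row per part.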