Any help is really appreciated.
I have the JSON below, returned by an API call. I've omitted sensitive data, but the key names are as presented ("value", "value_raw").
[{
"Position": "1234",
"StartDate": "2020-11-21",
"ID": "1234",
"CloseDate": "2020-12-07",
"Title": "This is a title",
"data": [{
"value": 1234
},
{
"value": "some text"
},
{
"value": "some text",
"value_raw": 11111
},
{
"value_raw": 11111,
"value": "some text"
},
{
"value": "null"
},
{
"value": "some text",
"value_raw": 22222
},
{
"value_raw": 2222222,
"value": "some text"
},
{
"value_raw": "null",
"value": "null"
},
{
"value_raw": "null",
"value": "null"
},
{
"value_raw": 2222222,
"value": "some text"
},
{
"value": "null"
},
{
"value": "some text",
"value_raw": 2222
},
{
"value": 1
},
{
"value": "some text",
"value_raw": 22222
}
]
}, {
"Position": "1235",
"StartDate": "2020-12-21",
"ID": "1235",
"CloseDate": "2021-01-12",
"Title": "some text",
"data": [{
"value": 1235
},
{
"value": "some text"
},
{
"value": "some text",
"value_raw": 1111
},
{
"value": "some text",
"value_raw": 1111
},
{
"value": "null"
},
{
"value_raw": 1111,
"value": "some text"
},
{
"value_raw": 11111,
"value": "some text"
},
{
"value_raw": "null",
"value": "null"
},
{
"value": "some text",
"value_raw": 1111
},
{
"value_raw": "null",
"value": "null"
},
{
"value": "null"
},
{
"value": "some text",
"value_raw": 22222
},
{
"value": 1
},
{
"value_raw": 22222,
"value": "some text"
}
]
}, {
"ID": "1236",
"Position": "1236",
"StartDate": "2021-07-12",
"data": [{
"value": 1236
},
{
"value": "some text"
},
{
"value_raw": 1111,
"value": "some text"
},
{
"value": "some text",
"value_raw": 1111
},
{
"value": "null"
},
{
"value_raw": 1111,
"value": "some text"
},
{
"value_raw": 1111,
"value": "some text"
},
{
"value_raw": "null",
"value": "null"
},
{
"value": "null",
"value_raw": "null"
},
{
"value": "some text",
"value_raw": 111
},
{
"value": "null"
},
{
"value": "some text",
"value_raw": 12223
},
{
"value": 1
},
{
"value": "some text",
"value_raw": 2222
}
],
"Title": "some text",
"CloseDate": "2021-07-23"
}
]
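When I normalize "data" using:
import pandas as pd
df = pd.json_normalize(mydata, record_path=['data'])
I end up with an output of 2 columns x 42 rows (excluding headings), i.e. the 14 "data" entries of each of the 3 records stacked on top of each other. Illustration of the first rows: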
| value | value_raw |
|---|---|
| 1234 | |
| This is a title | |
| some text | 11111 |
| Corporation | 11111 |
| null | |
| some text | 22222 |
| some text | 2222222 |
| null | null |
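To make the stacking visible, here is a minimal sketch of the same call on a shortened, made-up stand-in for the response (two records with three entries each; the sample values are assumptions, not the real data). Passing `meta=["ID"]` is one way to keep track of which parent record each row came from.

```python
import pandas as pd

# Hypothetical, shortened stand-in for the API response.
mydata = [
    {"ID": "1234", "data": [{"value": 1234},
                            {"value": "some text", "value_raw": 11111},
                            {"value": "null"}]},
    {"ID": "1235", "data": [{"value": 1235},
                            {"value": "some text", "value_raw": 1111},
                            {"value": "null"}]},
]

# record_path turns every entry of every "data" list into its own row;
# meta copies the parent record's ID onto each of those rows.
long_df = pd.json_normalize(mydata, record_path=["data"], meta=["ID"])
print(long_df)
```

With the full response this would still give 42 rows, but each row would carry its ID, which makes a later regrouping possible.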
| null | null |
The only data I'm interested in is the key "value". I'd also like to know how to lay this data out as 3 rows x 14 columns: one row for each ID ('1234', '1235' and '1236'). No column headings are needed, as they provide no benefit with the naming convention "value".
Any starting point would be great; I have spent hours looking at previous questions. What I have noticed is that the JSON I receive is structured very differently from the examples out there.
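For illustration, the target layout could be sketched roughly as below, bypassing `json_normalize` and building the rows directly. This assumes every "data" list has the same length and every entry has a "value" key; the sample data is a shortened, made-up stand-in (two records, three entries each), so the result here is 2 x 3 rather than 3 x 14.

```python
import pandas as pd

# Hypothetical, shortened stand-in for the API response.
mydata = [
    {"ID": "1234", "data": [{"value": 1234},
                            {"value": "some text", "value_raw": 11111},
                            {"value": "null"}]},
    {"ID": "1235", "data": [{"value": 1235},
                            {"value": "some text", "value_raw": 1111},
                            {"value": "null"}]},
]

# One list per record, keeping only the "value" of each entry in "data".
rows = [[item.get("value") for item in record["data"]] for record in mydata]

# One row per ID; the columns are simply numbered 0..n-1.
wide_df = pd.DataFrame(rows, index=[record["ID"] for record in mydata])

# header=False omits the column headings when printing.
print(wide_df.to_string(header=False))
```

Using `item.get("value")` rather than `item["value"]` means an entry without that key would become `None` instead of raising an error.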
Thanks everyone