I want to export a pandas DataFrame to nested JSON for ingestion into MongoDB.
Here's an example of the data:
import pandas as pd

data = {
    'product_id': ['a001', 'a001', 'a001'],
    'product': ['aluminium', 'aluminium', 'aluminium'],
    'production_id': ['b001', 'b002', 'b002'],
    'production_name': ['metallurgical', 'recycle', 'recycle'],
    'geo_name': ['US', 'EU', 'RoW'],
    'value': [100, 200, 200]
}
df = pd.DataFrame(data=data)
| product_id | product | production_id | production_name | geo_name | value |
|---|---|---|---|---|---|
| a001 | aluminium | b001 | metallurgical | US | 100 |
| a001 | aluminium | b002 | recycle | EU | 200 |
| a001 | aluminium | b002 | recycle | RoW | 200 |
This is what the final JSON should look like:
{
  "name_id": "a001",
  "name": "aluminium",
  "activities": [
    {
      "product_id": "b001",
      "product_name": "metallurgical",
      "regions": [
        {
          "geo_name": "US",
          "value": 100
        }
      ]
    },
    {
      "product_id": "b002",
      "product_name": "recycle",
      "regions": [
        {
          "geo_name": "EU",
          "value": 200
        },
        {
          "geo_name": "RoW",
          "value": 200
        }
      ]
    }
  ]
}
There are some questions that come close to my problem, but they are either years old and rely on an older version of pandas (so the solutions break), or they do not group and nest the JSON the way I need. For example, "How to create a nested JSON from pandas DataFrame?" only covers a single level of nesting.
Some help would be really appreciated.
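One possible way to get this shape is a nested `groupby`: group by product first, then by production within each product, and turn the remaining columns into a list of records. The following is only a minimal sketch under the assumptions of the example above. The helper name `to_nested` is hypothetical, the output keys are copied from the target JSON, and it returns a list with one document per product rather than a single object.

```python
import json

import pandas as pd

data = {
    'product_id': ['a001', 'a001', 'a001'],
    'product': ['aluminium', 'aluminium', 'aluminium'],
    'production_id': ['b001', 'b002', 'b002'],
    'production_name': ['metallurgical', 'recycle', 'recycle'],
    'geo_name': ['US', 'EU', 'RoW'],
    'value': [100, 200, 200],
}
df = pd.DataFrame(data=data)


def to_nested(frame):
    """Hypothetical helper: build one nested dict per product."""
    docs = []
    # sort=False keeps the groups in order of first appearance.
    for (pid, pname), prod in frame.groupby(['product_id', 'product'], sort=False):
        activities = []
        for (aid, aname), act in prod.groupby(
            ['production_id', 'production_name'], sort=False
        ):
            activities.append({
                # Key names follow the target JSON in the question.
                'product_id': aid,
                'product_name': aname,
                # One dict per remaining row: geo_name and value.
                'regions': act[['geo_name', 'value']].to_dict(orient='records'),
            })
        docs.append({'name_id': pid, 'name': pname, 'activities': activities})
    return docs


docs = to_nested(df)
print(json.dumps(docs, indent=2))
```

Because the result is a plain list of dicts, it could in principle be handed to a MongoDB driver directly (for instance pymongo's `insert_many`) instead of being serialized to a JSON string first. Note that rows with missing values in the grouping columns are dropped by `groupby` by default.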