I have a data frame with multiple columns that I would like to convert to a .json file. The structure of the .json file should be as follows: I want to use one column as an 'identifier' column, whose values serve as keys for a dictionary. All values in this column are unique. Every other column should then appear as a key-value mapping under each identifier value, in the order of appearance. I am also looking for a function to reproduce the data frame from this .json file. Here's some example code that produces a dummy data frame:
import numpy as np
import pandas as pd
data_dictionary = {'col_1': [np.nan, np.nan, np.nan, np.nan],
                   'col_2': [np.nan, 1, np.nan, 1],
                   'col_3': ['a', 'b', 'c', 'd'],
                   'col_4': ['description of a', 'description of b', 'description of c', 'description of d']}
df = pd.DataFrame(data_dictionary)
which gives:
col_1 col_2 col_3 col_4
0 NaN NaN a description of a
1 NaN 1.0 b description of b
2 NaN NaN c description of c
3 NaN 1.0 d description of d
And this is what the .json file should look like (using col_3 as the identifier column):
{
"col_3": {
"a": {
"col_1": null,
"col_2": null,
"col_4": "description of a"
},
"b": {
"col_1": null,
"col_2": 1,
"col_4": "description of b"
},
"c": {
"col_1": null,
"col_2": null,
"col_4": "description of c"
},
"d": {
"col_1": null,
"col_2": 1,
"col_4": "description of d"
}
}
}
df.set_index('col_3').to_json(orient='index') almost solves your problem; the only piece it misses is the outer "col_3" key wrapped around the mapping.
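A minimal sketch of the full round trip could look like the following (df_to_json, json_to_df and the file name data.json are just placeholder names for this example; column order and dtypes of the rebuilt frame may differ slightly from the original):

import json
import pandas as pd

def df_to_json(df, identifier, path):
    # to_json(orient='index') maps each identifier value to a dict of the
    # remaining columns and already serializes NaN as null; the only extra
    # step is wrapping that mapping in the identifier column's name.
    records = json.loads(df.set_index(identifier).to_json(orient='index'))
    with open(path, 'w') as f:
        json.dump({identifier: records}, f, indent=4)

def json_to_df(path):
    # Reverse direction: the outer key becomes the identifier column again.
    # Note the identifier comes back as the first column and nulls come back
    # as None, so dtypes may not match the original exactly.
    with open(path) as f:
        nested = json.load(f)
    identifier = next(iter(nested))
    out = pd.DataFrame.from_dict(nested[identifier], orient='index')
    out.index.name = identifier
    return out.reset_index()

df_to_json(df, 'col_3', 'data.json')
df_roundtrip = json_to_df('data.json')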