Create json file based on values in one column

Question

I have a data frame with multiple columns that I would like to convert to a .json file. The file should be structured as follows: one column serves as an 'identifier' column whose values become the keys of a dictionary. All values in this column are unique. For each identifier value, all other columns should be represented as key-value mappings, in their order of appearance. I am also looking for a function that reproduces the data frame from this .json file. Here is example code that produces a dummy data frame:

import numpy as np
import pandas as pd

data_dictionary = {'col_1':[np.nan,np.nan,np.nan,np.nan],
                   'col_2':[np.nan,1,np.nan,1],
                   'col_3':['a','b','c','d'],
                   'col_4':['description of a','description of b','description of c','description of d']}

df = pd.DataFrame(data_dictionary)

which gives:

   col_1  col_2 col_3             col_4
0    NaN    NaN     a  description of a
1    NaN    1.0     b  description of b
2    NaN    NaN     c  description of c
3    NaN    1.0     d  description of d

This is what the .json file should look like (using col_3 as the identifier column):

{
  "col_3": {
    "a": {
      "col_1": null,
      "col_2": null,
      "col_4": "description of a"
    },
    "b": {
      "col_1": null,
      "col_2": 1,
      "col_4": "description of b"
    },
    "c": {
      "col_1": null,
      "col_2": null,
      "col_4": "description of c"
    },
    "d": {
      "col_1": null,
      "col_2": 1,
      "col_4": "description of d"
    }
  }
}

df.set_index('col_3').to_json(orient='index') almost solves your problem.
– Quang Hoang, Commented Nov 18, 2020 at 15:24

adir abargil · Accepted Answer · 2020-11-18 15:31:51Z


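Let me try something: set col_3 as the index, export with orient='index', and wrap the parsed result under a 'col_3' key: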

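import json

# Index by the identifier column; orient='index' yields {identifier: {column: value}}
dict_result = df.set_index('col_3').to_json(orient='index')
# to_json returns a JSON string, so parse it before nesting it under the 'col_3' key
final = {'col_3': json.loads(dict_result)}
print(final)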

Output:

{'col_3': {
    'a': {'col_1': None, 'col_2': None, 'col_4': 'description of a'},
    'b': {'col_1': None, 'col_2': 1.0, 'col_4': 'description of b'},
    'c': {'col_1': None, 'col_2': None, 'col_4': 'description of c'},
    'd': {'col_1': None, 'col_2': 1.0, 'col_4': 'description of d'}
}}
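For reuse, the same steps could be wrapped in a small helper that also writes the file. This is only a sketch: the function name frame_to_nested_dict and the temporary output path are assumptions made for illustration, not part of pandas or of the question.

```python
import json
import os
import tempfile

import numpy as np
import pandas as pd


def frame_to_nested_dict(df, id_col):
    """Nest all other columns under the values of `id_col` (hypothetical helper)."""
    if not df[id_col].is_unique:
        raise ValueError(f"values in {id_col!r} must be unique")
    # to_json writes NaN as null; json.loads parses the string back into a dict
    records = json.loads(df.set_index(id_col).to_json(orient="index"))
    return {id_col: records}


# Shortened version of the dummy data frame from the question
df = pd.DataFrame({
    "col_1": [np.nan, np.nan],
    "col_2": [np.nan, 1],
    "col_3": ["a", "b"],
    "col_4": ["description of a", "description of b"],
})

nested = frame_to_nested_dict(df, "col_3")

# Written to a temporary directory here; any writable path would do
path = os.path.join(tempfile.mkdtemp(), "data.json")
with open(path, "w") as fp:
    json.dump(nested, fp, indent=2)
```

Note that whole numbers in a float column are written as 1.0 rather than 1, because col_2 contains NaN and is therefore stored as float.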

edited Nov 18, 2020 at 15:31

answered Nov 18, 2020 at 15:26


4 Comments

Johannes Wiesner Over a year ago

That's pretty close. The only thing I am missing when I save the .json file to disk is the 'col_3' key. df.set_index('col_3').to_json('test.json', orient='index') gives me a single dict with 'a', 'b', 'c', 'd' as keys.

adir abargil Over a year ago

You skipped this step: final = {'col_3': json.loads(dict_result)}

Johannes Wiesner Over a year ago

Ah, the only thing missing to also save the .json file was: with open('data.json', 'w') as fp: json.dump(final, fp)

adir abargil Over a year ago

Ok, happy you found a way!
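The question also asked for a way to rebuild the data frame from the file, which the thread does not cover. One possible sketch is below; nested_json_to_frame is a hypothetical helper name, and the sample file is created inline only so the snippet runs on its own.

```python
import json
import os
import tempfile

import pandas as pd


def nested_json_to_frame(path):
    """Rebuild a data frame from the nested layout (hypothetical helper)."""
    with open(path) as fp:
        nested = json.load(fp)
    # The single top-level key is the name of the identifier column
    id_col, records = next(iter(nested.items()))
    # One row per identifier value, in the order stored in the file
    df = pd.DataFrame(
        list(records.values()),
        index=pd.Index(list(records), name=id_col),
    )
    # Turn the identifier index back into an ordinary column
    return df.reset_index()


# Sample file in the layout from the question (two rows only)
nested = {
    "col_3": {
        "a": {"col_1": None, "col_2": None, "col_4": "description of a"},
        "b": {"col_1": None, "col_2": 1.0, "col_4": "description of b"},
    }
}
path = os.path.join(tempfile.mkdtemp(), "data.json")
with open(path, "w") as fp:
    json.dump(nested, fp)

restored = nested_json_to_frame(path)
print(restored)
```

Two caveats with this approach: the JSON does not record where the identifier column originally sat, so it comes back as the first column, and a column that is entirely null may be restored with object dtype rather than float.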
