Convert dataframe to JSON using Python

Question

I have been trying to convert a dataframe to JSON using Python. The conversion succeeds, but I am not getting the JSON format I need.

Code -

df1 = df.rename_axis('CUST_ID').reset_index()
df.to_json('abc.json')

Here, abc.json is the output JSON filename and df is the dataframe to convert.

What I am getting -

{"CUST_LAST_UPDATED":{"1000":1556879045879.0,"1001":1556879052416.0},
"CUST_NAME":{"1000":"newly updated_3_file","1001":"heeloo1"}}

What I want -

[{"CUST_ID":1000,"CUST_NAME":"newly updated_3_file","CUST_LAST_UPDATED":1556879045879},
{"CUST_ID":1001,"CUST_NAME":"heeloo1","CUST_LAST_UPDATED":1556879052416}]

Error -

Traceback (most recent call last):
  File "C:/Users/T/PycharmProject/test_pandas.py", line 19, in <module>
    df1 = df.rename_axis('CUST_ID').reset_index()
  File "C:\Users\T\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\frame.py", line 3379, in reset_index
    new_obj.insert(0, name, level_values)
  File "C:\Users\T\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\frame.py", line 2613, in insert
    allow_duplicates=allow_duplicates)
  File "C:\Users\T\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\internals.py", line 4063, in insert
    raise ValueError('cannot insert {}, already exists'.format(item))
ValueError: cannot insert CUST_ID, already exists

df.head() Output -

    CUST_ID  CUST_LAST_UPDATED              CUST_NAME
0     1000      1556879045879     newly updated_3_file
1     1001      1556879052416                  heeloo1

How can I change the format when converting the dataframe to JSON?

jezrael · Accepted Answer · 2019-05-09 12:54:50Z

3

Use DataFrame.rename_axis with DataFrame.reset_index to turn the index into a CUST_ID column, then call DataFrame.to_json with orient='records':

df1 = df.rename_axis('CUST_ID').reset_index()
df1.to_json('abc.json', orient='records')

[{"CUST_ID":"1000",
  "CUST_LAST_UPDATED":1556879045879.0,
  "CUST_NAME":"newly updated_3_file"},
 {"CUST_ID":"1001",
  "CUST_LAST_UPDATED":1556879052416.0,
  "CUST_NAME":"heeloo1"}]

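As a minimal sketch of the case this first snippet assumes, where the CUST_ID values sit in the index rather than in a column. The sample frame is rebuilt by hand from the question's data, so its construction is an assumption:

```python
import json

import pandas as pd

# Hypothetical reconstruction: CUST_ID values live in the index, not in a column.
df = pd.DataFrame(
    {
        "CUST_LAST_UPDATED": [1556879045879, 1556879052416],
        "CUST_NAME": ["newly updated_3_file", "heeloo1"],
    },
    index=[1000, 1001],
)

# Name the index, then move it into a regular column.
df1 = df.rename_axis("CUST_ID").reset_index()

# With no path argument, to_json returns the JSON text instead of writing a file.
records = json.loads(df1.to_json(orient="records"))
print(records)
```

Here the index holds integers, so CUST_ID comes out as a number. In the output above it is a string, presumably because that test frame had a string index.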
EDIT:

Because the data has a default index and CUST_ID is already a column, skip rename_axis and reset_index and call to_json on df directly:

df.to_json('abc.json', orient='records')

Verify:

print(df.to_json(orient='records'))
[{"CUST_ID":1000,
  "CUST_LAST_UPDATED":1556879045879,
  "CUST_NAME":"newly updated_3_file"},
 {"CUST_ID":1001,
  "CUST_LAST_UPDATED":1556879052416,
  "CUST_NAME":"heeloo1"}]

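A short sketch of why the question's traceback appears and why no index handling is needed in this case. The frame is typed in by hand to match the df.head() output in the question, which is an assumption about the real data:

```python
import json

import pandas as pd

# Frame shaped like the df.head() output in the question: CUST_ID is a column.
df = pd.DataFrame(
    {
        "CUST_ID": [1000, 1001],
        "CUST_LAST_UPDATED": [1556879045879, 1556879052416],
        "CUST_NAME": ["newly updated_3_file", "heeloo1"],
    }
)

# Naming the default index CUST_ID and resetting it collides with the column.
try:
    df.rename_axis("CUST_ID").reset_index()
    collided = False
except ValueError as exc:
    collided = True
    print(exc)

# Each row becomes one JSON object; the default index is not written.
text = df.to_json(orient="records")
records = json.loads(text)
print(text)
```

With orient='records' the default RangeIndex is simply left out, so the three columns are the only keys in each object.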
edited May 9, 2019 at 12:54

answered May 8, 2019 at 11:53


8 Comments

TeeKay Over a year ago

It is giving an error. Please Check post for the error

TeeKay Over a year ago

I cannot rename the column, neither can I drop it and recreate because the value in the CUST_ID column will change. But if I try just the first approach, it is still giving the error.

TeeKay Over a year ago

No, I do not want any duplicated column. I want only three columns. Like the one I mentioned in the question.

TeeKay Over a year ago

df.head() returns top n (5 by default) rows of a data frame.

TeeKay Over a year ago

   CUST_ID  CUST_LAST_UPDATED             CUST_NAME
0     1000      1556879045879  newly updated_3_file
1     1001      1556879052416               heeloo1


Mohammad · Accepted Answer · 2019-11-19 14:57:30Z

0

You can convert a dataframe to a JSON-style list of records using to_dict:

df1.to_dict('records')

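The output has the structure you need. Note that to_dict returns a Python list of dicts rather than JSON text, so pass it to json.dumps if you need a string.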
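A minimal sketch of the to_dict route, assuming df1 is a frame with the three columns from the question (the sample values are copied from the question, and the frame is built by hand here):

```python
import json

import pandas as pd

# Hypothetical stand-in for df1, built from the question's sample rows.
df1 = pd.DataFrame(
    {
        "CUST_ID": [1000, 1001],
        "CUST_NAME": ["newly updated_3_file", "heeloo1"],
        "CUST_LAST_UPDATED": [1556879045879, 1556879052416],
    }
)

# One dict per row, keyed by column name.
rows = df1.to_dict("records")

# Serialize the list of dicts to JSON text with the standard library.
text = json.dumps(rows)
print(text)
```

This goes through Python objects first, which can be convenient if the records need further editing before they are written out.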

answered Nov 19, 2019 at 14:57


mannem srinivas · Accepted Answer · 2021-08-05 11:06:22Z

0

If the dataframe has NaN values in some rows and you don't want them in your JSON file, you can drop them row by row with the code below:

import argparse
import json

import pandas as pd

if __name__ == "__main__":
    # Paths to the input CSV and the output JSON file come from the command line.
    parser = argparse.ArgumentParser()
    parser.add_argument("--csv")
    parser.add_argument("--json")
    args = parser.parse_args()

    entities = pd.read_csv(args.csv)

    # Build one dict per row, dropping that row's NaN entries first.
    json_data = [row.dropna().to_dict() for _, row in entities.iterrows()]
    with open(args.json, "w") as file:
        json.dump(json_data, file)
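The same row-wise filter can be tried without a CSV file. This is only a sketch: the in-memory frame, its CUST_NOTE column, and the "pending" value are made up for illustration:

```python
import json

import pandas as pd

# Hypothetical data: each row is missing a different field.
entities = pd.DataFrame(
    {
        "CUST_ID": [1000, 1001],
        "CUST_NAME": ["heeloo1", None],
        "CUST_NOTE": [None, "pending"],
    }
)

# Drop the missing entries of each row before turning it into a dict.
json_data = [row.dropna().to_dict() for _, row in entities.iterrows()]
text = json.dumps(json_data)
print(text)

# For comparison, to_json keeps every key and writes missing values as null.
with_nulls = entities.to_json(orient="records")
print(with_nulls)
```

The trade-off is that the resulting objects no longer share the same set of keys, so whatever reads the file has to tolerate missing fields.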

answered Aug 5, 2021 at 11:06
