
I am calling an API and it returns a JSON string. I want to convert it to CSV format so I can save it to a database later. However, the JSON object keys cause problems because some keys are missing and some keys change between objects. I wrote this Python script, but because of the keys I cannot get it to work:

import json
import csv

with open('custom.json') as json_file:
    data = json.load(json_file)

custom_data = data['CustomJSON']
data_file = open('data_file.csv', 'w')
csv_writer = csv.writer(data_file)
count = 0

for i in custom_data:
    if count == 0:
        # Writing headers of CSV file
        header = i.keys()
        csv_writer.writerow(header)
        count += 1
    # Writing data of CSV file
    csv_writer.writerow(i.values())
data_file.close()

How can I convert this type of JSON to CSV? Example JSON message:

{
    "CustomJSON" : [
    {
      "id" : "1",
      "name" : "Jack",
      "surname" : "Bauer"
    },
    {
      "id" : "2",
      "name" : "John",
      "surname" : "Smith",
      "age" : "31",
      "city" : "New York"
    },
    {
      "id" : "3",
      "name" : "Matt",
      "surname" : "Secret",
      "exception_1" : "Exception_1",
      "exception_2" : "Exception_2",
      "date" : "2022-02-08"
    }
  ]
}

Should I first loop over all key-value pairs somehow and then add the data? Can anyone provide an example?

    I tried loading your json, and it's not valid. If you provide actual, pasted code, we can help you much more quickly and easily. Commented Feb 8, 2022 at 8:52

2 Answers


As you are reading a single JSON string, you will have everything in memory. So IMHO the simplest way is to first build the list of field names, and then write everything to a CSV file.

# compute the fieldnamelist
# this uses a dict because it is easy to update it while maintaining key order
keys = dict()
for d in data['CustomJSON']:
    keys.update(d)

# write to the csv file
# this uses a DictWriter because the individual rows are already dicts
with open('data_file.csv', 'w', newline='') as data_file:
    csv_writer = csv.DictWriter(data_file, fieldnames=keys.keys())
    _ = csv_writer.writeheader()
    _ = csv_writer.writerows(data['CustomJSON'])

With your data it gives the expected output:

id,name,surname,age,city,exception_1,exception_2,date
1,Jack,Bauer,,,,,
2,John,Smith,31,New York,,,
3,Matt,Secret,,,Exception_1,Exception_2,2022-02-08
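Putting the two passes together, here is a self-contained sketch of the same approach (using an inline dict and an in-memory buffer instead of `custom.json` and `data_file.csv`, so it runs standalone):

```python
import csv
import io

# Sample records with inconsistent keys, mirroring the question's JSON
data = {
    "CustomJSON": [
        {"id": "1", "name": "Jack", "surname": "Bauer"},
        {"id": "2", "name": "John", "surname": "Smith", "age": "31", "city": "New York"},
    ]
}

# First pass: collect every key that appears, preserving insertion order
keys = dict()
for d in data["CustomJSON"]:
    keys.update(dict.fromkeys(d))

# Second pass: DictWriter fills fields absent from a row with restval (default '')
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=keys.keys())
writer.writeheader()
writer.writerows(data["CustomJSON"])
print(buf.getvalue())
```

Because `DictWriter` defaults `restval` to the empty string, rows that lack a key simply get an empty CSV field, which is exactly the behavior the question needs.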

3 Comments

If the OP wants a full header row in the output file, then it's not merely your opinion that the data will need to be preprocessed first to determine the keys — it's a certitude. The amount of memory needed to store the keys dictionary could be minimized by using keys.update(dict.fromkeys(d.keys())), which would strip off the values.
@martineau: If rows were expected to come from a large number of big files, I would have suggested using a database directly, to be able to add new columns after reading a new file. The sqlite module would make it easy...
Putting the data in a database first would also be a form of preprocessing, would it not? That's the point I was trying to make.
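For reference, the sqlite idea from the comments above could be sketched like this — the table name and schema here are made up for illustration, and new columns are added on the fly whenever a record introduces a key we haven't seen:

```python
import sqlite3

# Hypothetical rows with drifting keys, as in the question
rows = [
    {"id": "1", "name": "Jack"},
    {"id": "2", "name": "John", "city": "New York"},
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id TEXT)")
known = {"id"}

for row in rows:
    # Grow the schema whenever a record introduces a new key
    for key in row.keys() - known:
        conn.execute(f'ALTER TABLE people ADD COLUMN "{key}" TEXT')
        known.add(key)
    cols = ", ".join(f'"{k}"' for k in row)
    placeholders = ", ".join("?" for _ in row)
    conn.execute(f"INSERT INTO people ({cols}) VALUES ({placeholders})",
                 list(row.values()))

# Rows inserted before a column existed read back as NULL for that column
for r in conn.execute("SELECT id, name, city FROM people ORDER BY id"):
    print(r)
```

Earlier rows simply have NULL in the later-added columns, which mirrors how the CSV approach leaves those fields empty.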

I am a pandas fan(atic) so I'd do something like

import pandas as pd

# df is a pandas DataFrame
df = pd.read_json('http://data.com/foo')
df.to_csv('foo.csv')

Pandas has options for the CSV dialect, if need be. You should be able to do what you describe with those two function calls, though.
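One caveat worth noting: with the question's nested structure (`{"CustomJSON": [...]}`), `read_json` would leave the records as a single column of dicts, so the list needs to be flattened first. `pd.json_normalize` does that and aligns the inconsistent keys into columns, filling absent values with NaN (which `to_csv` writes as empty fields). A sketch with inline data standing in for the API response:

```python
import pandas as pd

# Records with inconsistent keys, as in the question's example
records = [
    {"id": "1", "name": "Jack", "surname": "Bauer"},
    {"id": "2", "name": "John", "surname": "Smith", "age": "31"},
]

# json_normalize aligns all keys into columns; missing values become NaN
df = pd.json_normalize(records)
csv_text = df.to_csv(index=False)
print(csv_text)
```

With a real response you would pass `json.loads(response_text)["CustomJSON"]` to `json_normalize` instead of the inline list.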

1 Comment

Suggesting that someone download and install a large module, and then learn how to use it in order to solve this relatively trivial task is not a good suggestion IMO.
