Python CSV to JSON W/ Array Output

Question

I'm trying to take data from a CSV and put it in a top-level array in JSON format.

Currently I am running this code:

import csv
import json

csvfile = open('music.csv', 'r')
jsonfile = open('file.json', 'w')

fieldnames = ("ID","Artist","Song", "Artist")
reader = csv.DictReader( csvfile, fieldnames)
for row in reader:
    json.dump(row, jsonfile)
    jsonfile.write('\n')

The CSV file is formatted like so:

| 1 | Empire of the Sun | We Are The People | Walking on a Dream |
| 2 | M83 | Steve McQueen | Hurry Up We're Dreaming |

The columns are: Column 1: ID | Column 2: Artist | Column 3: Song | Column 4: Album

And getting this output:

{"Song": "Empire of the Sun", "ID": "1", "Artist": "Walking on a Dream"}
{"Song": "M83", "ID": "2", "Artist": "Hurry Up We're Dreaming"}

I'm trying to get it to look like this though:

{             
    "Music": [

    {
        "id": 1,
        "Artist": "Empire of the Sun",
        "Name": "We are the People",
        "Album": "Walking on a Dream"
    },
    {
        "id": 2,
        "Artist": "M83",
        "Name": "Steve McQueen",
        "Album": "Hurry Up We're Dreaming"
    },
    ]
}

Just for Question 1, use the following snippet for your DictReader setup: import collections ; reader = DictReader(csvfile, fieldnames, dict_class=collections.OrderedDict)
– Michel Müller, Commented Jan 5, 2017 at 8:21
2 & 3 are unclear. Please specify the expected output if you want others to help.
– Michel Müller, Commented Jan 5, 2017 at 8:23
The import goes with your other imports at the beginning, and the reader = line is a drop-in replacement for your DictReader initialisation.
– Michel Müller, Commented Jan 5, 2017 at 8:26
When I put "import collections ;" at the top and add the "reader = DictReader(csv..)" line, it says: reader = DictReader(csvfile, dict_class=collections.OrderedDict) NameError: name 'DictReader' is not defined
– orpheus, Commented Jan 5, 2017 at 8:35

chthonicdaemon · Accepted Answer · 2017-01-05 09:25:27Z

Pandas solves this really simply. First, read the file:

import pandas

df = pandas.read_csv('music.csv', names=("id","Artist","Song", "Album"))

Now you have some options. The quickest way to get a proper JSON file out of this is simply:

df.to_json('file.json', orient='records')

Output:

[{"id":1,"Artist":"Empire of the Sun","Song":"We Are The People","Album":"Walking on a Dream"},{"id":2,"Artist":"M83","Song":"Steve McQueen","Album":"Hurry Up We're Dreaming"}]

This doesn't handle the requirement that you want it all in a "Music" object or the order of the fields, but it does have the benefit of brevity.

To wrap the output in a Music object, we can use to_dict:

import json
with open('file.json', 'w') as f:
    json.dump({'Music': df.to_dict(orient='records')}, f, indent=4)

Output:

{
    "Music": [
        {
            "id": 1,
            "Album": "Walking on a Dream",
            "Artist": "Empire of the Sun",
            "Song": "We Are The People"
        },
        {
            "id": 2,
            "Album": "Hurry Up We're Dreaming",
            "Artist": "M83",
            "Song": "Steve McQueen"
        }
    ]
}

I would advise you to reconsider insisting on a particular order for the fields since the JSON specification clearly states "An object is an unordered set of name/value pairs" (emphasis mine).
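
If the order does matter to you anyway, here is a minimal sketch of how the standard json module treats it (Python 3, standard library only; the sample record is taken from the question). As far as I know, json.dumps writes keys in the iteration order of the mapping it is given, and json.loads can keep the order found in the text when passed object_pairs_hook=OrderedDict. Other JSON consumers are still free to ignore that order.

```python
import json
from collections import OrderedDict

# Build one record with keys inserted in the order we want written out.
record = OrderedDict([
    ("id", 1),
    ("Artist", "Empire of the Sun"),
    ("Song", "We Are The People"),
    ("Album", "Walking on a Dream"),
])

# json.dumps emits keys in the mapping's iteration order.
text = json.dumps(record)

# object_pairs_hook rebuilds each JSON object as an OrderedDict,
# preserving the key order found in the text.
loaded = json.loads(text, object_pairs_hook=OrderedDict)
print(list(loaded.keys()))
```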

Michel Müller · Accepted Answer · 2017-01-05 09:06:14Z

Alright, this is untested, but try the following:

import csv
import json
from collections import OrderedDict

fieldnames = ("ID", "Artist", "Song", "Album")

entries = []
# The with statement is preferable because it closes the file properly after use.
with open('music.csv', 'r') as csvfile:
    reader = csv.DictReader(csvfile, fieldnames)
    for row in reader:
        # A standard dict does not guarantee any order (before Python 3.7),
        # but an OrderedDict keeps keys in the order they were written.
        entry = OrderedDict()
        for field in fieldnames:
            entry[field] = row[field]
        entries.append(entry)

output = {
    "Music": entries
}

with open('file.json', 'w') as jsonfile:
    json.dump(output, jsonfile)
    jsonfile.write('\n')

3 Comments

orpheus Over a year ago

Traceback (most recent call last):
  File "spotPy.py", line 9, in <module>
    reader = csv.DictReader( csvfile, fieldnames, dict_class=collections.OrderedDict)
  File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/csv.py", line 79, in __init__
    self.reader = reader(f, dialect, *args, **kwds)
TypeError: 'dict_class' is an invalid keyword argument for this function

Michel Müller Over a year ago

Oops, sorry, that was a non-standard DictReader. One moment.

chthonicdaemon Over a year ago

Note the error in OP's code where they had fieldnames = ("ID","Artist","Song", "Artist"). This should be fieldnames = ("ID","Artist","Song", "Album").
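
To see why the duplicated field name matters, here is a small sketch (Python 3; it assumes a comma-separated sample line, which is my assumption and not necessarily how the OP's file looks). With "Artist" listed twice, the fourth column overwrites the second under the same key, so the real artist is lost and no "Album" key ever appears.

```python
import csv
import io

# Assumed sample: one comma-separated line with the question's data.
sample = io.StringIO("1,Empire of the Sun,We Are The People,Walking on a Dream\n")

# "Artist" appears twice, as in the question's fieldnames tuple.
reader = csv.DictReader(sample, fieldnames=("ID", "Artist", "Song", "Artist"))
row = dict(next(reader))

# Only three keys survive; the album title has replaced the artist name.
print(row)
```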

jpmc26 · Accepted Answer · 2017-01-05 09:00:57Z

Your logic is in the wrong order. json is designed to convert a single object into JSON, recursively. So you should always be thinking in terms of building up a single object before calling dump or dumps.

First collect it into an array:

music = [r for r in reader]

Then put it in a dict:

result = {'Music': music}

Then dump to JSON:

json.dump(result, jsonfile)

Or all in one line:

json.dump({'Music': [r for r in reader]}, jsonfile)
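
A quick way to convince yourself of the difference, as a sketch with made-up in-memory rows instead of the real file: one dump per row produces text that is not a single JSON document, while one dump of an enclosing dict parses back cleanly.

```python
import json

rows = [
    {"ID": "1", "Artist": "Empire of the Sun"},
    {"ID": "2", "Artist": "M83"},
]

# One dump per row: each line is valid JSON, but the text as a whole is not.
per_row = "".join(json.dumps(row) + "\n" for row in rows)
try:
    json.loads(per_row)
    per_row_is_valid = True
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    per_row_is_valid = False

# One dump of a single enclosing object: the whole text parses.
single = json.dumps({"Music": rows})
parsed = json.loads(single)

print(per_row_is_valid, len(parsed["Music"]))
```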

"Ordered" JSON

If you really care about the order of object properties in the JSON (even though you shouldn't), you shouldn't use the DictReader. Instead, use the regular reader and create OrderedDicts yourself:

from collections import OrderedDict

...

reader = csv.reader(csvfile)
music = [OrderedDict(zip(fieldnames, r)) for r in reader]

Or in a single line again:

json.dump({'Music': [OrderedDict(zip(fieldnames, r)) for r in reader]}, jsonfile)

Other

Also, use context managers for your files to ensure they're closed properly:

with open('music.csv', 'r') as csvfile, open('file.json', 'w') as jsonfile:
    # Rest of your code inside this block
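
Putting the pieces of this answer together, a minimal sketch might look like the following. It is written for Python 3 and assumes a genuinely comma-separated input; csv_to_json and FIELDNAMES are names I made up for illustration, and the demo writes to a temporary directory rather than to your music.csv.

```python
import csv
import json
import os
import tempfile
from collections import OrderedDict

FIELDNAMES = ("ID", "Artist", "Song", "Album")


def csv_to_json(csv_path, json_path, fieldnames=FIELDNAMES):
    """Write {"Music": [...]} to json_path from a comma-separated csv_path."""
    # Both files are closed when the block exits, even if an exception is raised.
    # newline="" follows the csv module's recommendation for Python 3.
    with open(csv_path, "r", newline="") as csvfile, \
            open(json_path, "w") as jsonfile:
        music = [OrderedDict(zip(fieldnames, row)) for row in csv.reader(csvfile)]
        json.dump({"Music": music}, jsonfile, indent=4)
    return csvfile.closed and jsonfile.closed


# Demo in a temporary directory so no real files are touched.
workdir = tempfile.mkdtemp()
csv_path = os.path.join(workdir, "music.csv")
json_path = os.path.join(workdir, "file.json")
with open(csv_path, "w", newline="") as f:
    f.write("1,Empire of the Sun,We Are The People,Walking on a Dream\n")
    f.write("2,M83,Steve McQueen,Hurry Up We're Dreaming\n")

both_closed = csv_to_json(csv_path, json_path)
with open(json_path) as f:
    result = json.load(f, object_pairs_hook=OrderedDict)
```

Note that every value, including the ID, comes out as a string here, because the csv module does no type conversion.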

Robᵩ · Accepted Answer · 2017-01-05 09:13:40Z

"It didn't write to the JSON file in the order I would have liked"

In Python 2.7, csv.DictReader returns each row as a plain dict, and dictionaries there are unordered collections, so you have no control over their presentation order.

Python does provide an OrderedDict, which you can use if you avoid using csv.DictReader().

"and it skipped the song name altogether."

This is because the file is not really a CSV file. In particular, each line begins and ends with the field separator. We can use .strip("|") to fix this.

"I need all this data to be output into an array named 'Music'"

Then the program needs to create a dict with "Music" as a key.

"I need it to have commas after each artist info. In the output I get ..."

This problem arises because you call json.dump() multiple times. You should call it only once if you want a valid JSON file.

Try this:

import csv
import json
from collections import OrderedDict


def MyDictReader(fp, fieldnames):
    fp = (x.strip().strip('|').strip() for x in fp)
    reader = csv.reader(fp, delimiter="|")
    reader = ([field.strip() for field in row] for row in reader)
    dict_reader = (OrderedDict(zip(fieldnames, row)) for row in reader)
    return dict_reader

fieldnames = ("ID", "Artist", "Song", "Album")
# Context managers close both files even if an error occurs.
with open('music.csv', 'r') as csvfile, open('file.json', 'w') as jsonfile:
    reader = MyDictReader(csvfile, fieldnames)
    json.dump({"Music": list(reader)}, jsonfile, indent=2)
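
The desired output in the question also has a numeric "id" and uses "Name" instead of "Song". A possible variation on the same idea, as a sketch only: OUTPUT_KEYS and read_pipe_rows are hypothetical names of mine, the key mapping is my reading of the desired output, and the sample lines are copied from the question into an in-memory buffer (Python 3).

```python
import csv
import io
import json
from collections import OrderedDict

# Hypothetical mapping from the column names to the keys in the desired output.
OUTPUT_KEYS = (("ID", "id"), ("Artist", "Artist"), ("Song", "Name"), ("Album", "Album"))


def read_pipe_rows(lines):
    """Yield one OrderedDict per pipe-delimited line such as '| 1 | M83 | ... |'."""
    # Drop surrounding whitespace and the leading/trailing separators first.
    stripped = (line.strip().strip("|") for line in lines)
    for row in csv.reader(stripped, delimiter="|"):
        if not row:
            continue  # skip blank lines
        values = [field.strip() for field in row]
        entry = OrderedDict()
        for (source, target), value in zip(OUTPUT_KEYS, values):
            # Only the ID column is converted to an integer.
            entry[target] = int(value) if source == "ID" else value
        yield entry


sample = io.StringIO(
    "| 1 | Empire of the Sun | We Are The People | Walking on a Dream |\n"
    "| 2 | M83 | Steve McQueen | Hurry Up We're Dreaming |\n"
)
music = list(read_pipe_rows(sample))
text = json.dumps({"Music": music}, indent=4)
print(text)
```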

14 Comments

Michel Müller Over a year ago

You yield a standard dict. Doesn't that throw away the order again?

Robᵩ Over a year ago

No, the order is already destroyed. Each row is already a standard dict.

Michel Müller Over a year ago

Yep, but see my answer - you can force it into correct order by putting the entries into an OrderedDict in order of fieldnames.

orpheus Over a year ago

Traceback (most recent call last):
  File "spotPy.py", line 15, in <module>
    json.dump({"Music": list(reader)}, jsonfile, indent=2)
  File "spotPy.py", line 9, in MyDictReader
    yield {k:v.strip() for k,v in row.items()}
  File "spotPy.py", line 9, in <dictcomp>
    yield {k:v.strip() for k,v in row.items()}
AttributeError: 'NoneType' object has no attribute 'strip'

Robᵩ Over a year ago

@AlwaysSunny - jpmc's point is that Stack Overflow isn't intended or designed to be a code-writing service. It is intended to be a repository of useful information. Any question that is simply "please write my code for me" has no value for future generations and, in fact, makes it harder for people to find genuine questions and answers. If your goal really is to have someone write your code, there are other websites designed and intended for that purpose.
