
I want to convert CSV to JSON in Python. I can convert a simple CSV file to JSON, but I have not been able to join two CSV files into one nested JSON document.

emp.csv:
empid | empname | empemail
e123 | adam | [email protected]
e124 | steve | [email protected]
e125 | brian | [email protected]
e126 | mark | [email protected]

items.csv:
empid | itemid | itemname | itemqty
e123 | itm128 | glass | 25
e124 | itm130 | bowl | 15
e123 | itm116 | book | 50
e126 | itm118 | plate | 10
e126 | itm128 | glass | 15
e125 | itm132 | pen | 10

The output should look like this:

    [{
        "empid": "e123",
        "empname": "adam",
        "empemail": "[email protected]",
        "items": [{
            "itemid": "itm128",
            "itmname": "glass",
            "itemqty": 25
        }, {
            "itemid": "itm116",
            "itmname": "book",
            "itemqty": 50
        }]
    },
    and similar for others]

The code that I have written:

import csv
import json

empcsvfile = open('emp.csv', 'r')
jsonfile = open('datamodel.json', 'w')

itemcsvfile = open('items.csv', 'r')

empfieldnames = ("empid","name","phone","email")
itemfieldnames = ("empid","itemid","itemname","itemdesc","itemqty")

empreader = csv.DictReader( empcsvfile, empfieldnames)
itemreader = csv.DictReader( itemcsvfile, itemfieldnames)

output=[];
empcount=0
for emprow in empreader:
    output.append(emprow)   
    for itemrow in itemreader:
        if(itemrow["empid"]==emprow["empid"]):
            output.append(itemrow)
    empcount = empcount +1
print output
json.dump(output, jsonfile,sort_keys=True)

It does not work. Any help would be appreciated. Thanks.

  • Are the csv files - comma separated or pipe separated ? Commented Mar 24, 2017 at 3:24
  • I ran your code and put up a few print statements and this is what is the output - {'empid': 'empid | empname | empemail', 'phone': None, 'name': None, 'email': None} {'empid': 'e123 | adam | [email protected]', 'phone': None, 'name': None, 'email': None} {'empid': 'e124 | steve | [email protected]', 'phone': None, 'name': None, 'email': None} {'empid': 'e125 | brian | [email protected]', 'phone': None, 'name': None, 'email': None} {'empid': 'e126 | mark | [email protected]', 'phone': None, 'name': None, 'email': None} So, as you see since it's pipe separated , the entire row is considered as empid Commented Mar 24, 2017 at 3:26
  • Thanks for the question. Commented Mar 24, 2017 at 4:20
  • I would like to clarify that it is a comma-separated file; I used pipes only so that it would look like a table when posted on Stack Overflow. Apologies for the inconvenience. Commented Mar 24, 2017 at 4:21

2 Answers


Okay, so you have a few problems. The first is that you need to specify the delimiter for your CSV files. Your sample data uses the | character, but Python's csv module expects , by default. So you need to do this for both readers:

empreader = csv.DictReader( empcsvfile, empfieldnames, delimiter='|')
itemreader = csv.DictReader( itemcsvfile, itemfieldnames, delimiter='|')

Second, you aren't appending the items to the employee dictionary. You probably should create a key called 'items' on each employee dictionary object and append the items to that list. Like this:

for emprow in empreader:
    emprow['items'] = [] # create a list to hold items for this employee
    ...
    for itemrow in itemreader:
        ...
        emprow['items'].append(itemrow) # append an item for this employee

Third, each time you loop through an employee, you need to go back to the top of the items CSV file. Once Python reads to the end of a file, it does not return to the start on the next loop; you have to tell it to. Right now, your code reads through items.csv while processing the first employee and then stays at the end of the file for all the other employees. Call seek(0) on the file object to rewind it for each employee:

for emprow in empreader:
    emprow['items'] = []
    output.append(emprow)
    itemcsvfile.seek(0)
    for itemrow in itemreader:
        if(itemrow["empid"]==emprow["empid"]):
            emprow['items'].append(itemrow)
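
For larger files, re-reading items.csv once per employee gets slow. As an alternative to seek(0), here is a minimal sketch that reads the items once and groups them by empid in a dictionary. It assumes comma-separated files whose first row is the header (as clarified in the comments), so no field names are passed to DictReader. The function name nest_items, the in-memory sample data, and the example.com addresses are made up for illustration.

```python
import csv
import io
import json
from collections import defaultdict


def nest_items(emp_file, item_file):
    """Return a list of employee dicts, each with an 'items' list."""
    # Read the items file once and group its rows by employee id.
    items_by_emp = defaultdict(list)
    for row in csv.DictReader(item_file):
        empid = row.pop("empid")              # keep empid out of the nested item
        row["itemqty"] = int(row["itemqty"])  # csv yields strings; convert the quantity
        items_by_emp[empid].append(row)

    # Attach each employee's items; employees without items get an empty list.
    output = []
    for emp in csv.DictReader(emp_file):
        emp = dict(emp)
        emp["items"] = items_by_emp.get(emp["empid"], [])
        output.append(emp)
    return output


# Hypothetical sample data standing in for emp.csv and items.csv.
EMP_CSV = """empid,empname,empemail
e123,adam,adam@example.com
e124,steve,steve@example.com
"""
ITEMS_CSV = """empid,itemid,itemname,itemqty
e123,itm128,glass,25
e124,itm130,bowl,15
e123,itm116,book,50
"""

result = nest_items(io.StringIO(EMP_CSV), io.StringIO(ITEMS_CSV))
print(json.dumps(result, indent=4))
```

With real files you would pass the objects returned by open('emp.csv', newline='') and open('items.csv', newline='') instead of the StringIO objects.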

1 Comment

Worked like charm. Thank you.

The column names in the files do not match the field names in your code:

empid | empname | empemail

empfieldnames = ("empid","name","phone","email")

empid | itemid | itemname | itemqty

itemfieldnames = ("empid","itemid","itemname","itemdesc","itemqty")

CSV files usually use , rather than | as the delimiter.

What's more, JSON requires " rather than ' around strings.
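
To illustrate both points, here is a small sketch, assuming a comma-separated file whose first row is the header. When no field names are passed, DictReader takes them from that header row, so they cannot drift out of sync with the file. json.dumps writes strings with double quotes, whereas print on a Python list shows single quotes. The sample row and the example.com address are made up.

```python
import csv
import io
import json

# Hypothetical one-row sample standing in for emp.csv.
sample = io.StringIO("empid,empname,empemail\ne123,adam,adam@example.com\n")

reader = csv.DictReader(sample)      # no fieldnames: the header row supplies them
rows = [dict(r) for r in reader]
field_names = reader.fieldnames      # taken from the first line of the file

as_json = json.dumps(rows)           # JSON text, strings in double quotes
print(field_names)
print(as_json)
```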
