1

I have a CSV file, "input.csv", which contains the following data.

UID,BID,R
U1,B1,4
U1,B2,3
U2,B1,2

I want to turn the above into the following dictionary: grouped by UID as the key, with BID and R forming a nested dictionary as the value.

{"U1":{"B1":4, "B2": 3}, "U2":{"B1":2}}

I have the below code:

new_data_dict = defaultdict(str)
with open("input.csv", 'r') as data_file:
    data = csv.DictReader(data_file, delimiter=",")
    headers = next(data)
    for row in data:
        new_data_dict[row["UID"]] += {row["BID"]:int(row["R"])}

The above throws the following error:

TypeError: cannot concatenate 'str' and 'dict' objects

Is there a way to do this?

2 Answers

3

Using a regular dict, you can call get() to fetch the existing sub-dict for a UID (or a new empty one) and then fill it.

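import csv

new_data_dict = {}
with open("input.csv", 'r') as data_file:
    # DictReader reads the header row itself and uses it as the keys of each row
    data = csv.DictReader(data_file, delimiter=",")
    for row in data:
        # Fetch the sub-dict for this UID, or start a new one
        item = new_data_dict.get(row["UID"], dict())
        item[row["BID"]] = int(row["R"])

        new_data_dict[row["UID"]] = item

print(new_data_dict)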

Also, your call to next(data) is not needed: csv.DictReader reads the header row itself, so the extra call actually discards the first data row (U1,B1,4).
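
To illustrate the point about the header row, here is a minimal sketch, assuming Python 3. An in-memory io.StringIO object stands in for input.csv (that stand-in is an assumption for the example, not part of the original code). It shows that DictReader takes the field names from the first line and yields only the data rows:

```python
import csv
import io

# In-memory stand-in for input.csv (assumed to match the sample data in the question).
sample = io.StringIO("UID,BID,R\nU1,B1,4\nU1,B2,3\nU2,B1,2\n")

reader = csv.DictReader(sample)
# Accessing fieldnames reads the header line if it has not been read yet.
header = reader.fieldnames
# Only the three data rows remain; no manual next() call is required.
rows = list(reader)
```

Calling next(reader) before the loop would therefore consume the U1,B1,4 row rather than the header.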

2

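This is a more compact version that uses dict.setdefault() and splits each line manually instead of going through csv.DictReader (a plain split(",") assumes no quoted fields containing commas):

new_data_dict = {}
with open("input.csv", 'r') as data_file:
    next(data_file)  # skip the header row
    for row in data_file:
        row = row.strip().split(",")
        # setdefault returns the existing sub-dict for this UID, or inserts a new one
        new_data_dict.setdefault(row[0], {})[row[1]] = int(row[2])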
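
For completeness, the defaultdict approach attempted in the question can also work if the default factory is dict rather than str. The following is a minimal sketch, assuming Python 3; the function name group_ratings and the in-memory sample list are illustrative and do not come from the original posts:

```python
import csv
from collections import defaultdict


def group_ratings(lines):
    """Group CSV rows into {UID: {BID: R}}; `lines` is any iterable of CSV lines."""
    grouped = defaultdict(dict)  # a missing UID starts out as an empty dict
    for row in csv.DictReader(lines):
        grouped[row["UID"]][row["BID"]] = int(row["R"])
    return dict(grouped)  # convert to a plain dict for printing and comparison


# Hypothetical in-memory sample mirroring the data in the question.
sample = ["UID,BID,R", "U1,B1,4", "U1,B2,3", "U2,B1,2"]
result = group_ratings(sample)
```

With a real file, the open file object can be passed directly, for example inside a with open("input.csv", newline="") block.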
