python write dictionary from csv file

Question

I am struggling to build a dictionary from a csv file.

The csv file looks like this:

student,    Test 1, Test 2, Test 3, Final Exam
A,          9,      19,    9,       22
B,          10,     16,    9,       26
C,          11,     17,    8,       27
D,          7,      14,    9,       18
E,          8,      20,    8,       23
weight,     0.15,   0.25,  0.2,     0.4
max_points  12      20     9        30

Rows 2-6 hold the students' names and their scores on each test. The last two rows give the weight and the maximum score of each test, respectively.

Now, I want to create a dictionary from this file that looks like:

{'Test 1': {'weight': '0.15', 'max_points': '12'}, 
'Test 2': {'weight': '0.25', 'max_points': '20'}, 
'Test 3': {'weight': '0.2',   'max_points': '9'}, 
'Final Exam': {'weight': '0.4', 'max_points': '30'}}

The outer keys are the column names from the first row of the csv file, except for the student column. In each nested dictionary, the keys are the first-column labels of the last two rows (weight and max_points), and the values are the corresponding entries in those rows.

The only thing I have come up with so far is:

import csv

reader = csv.DictReader(open('gradebook.csv'))
for row in reader:
    key = row.pop('Student')  # row label: a student name, 'weight' or 'max_points'

I have no idea how to proceed. Thank you so much for the help!

Your file, as shown, is not a CSV file. Columns in a CSV file are separated by commas.
– DYZ, Commented Jan 26, 2017 at 3:15
@DYZ Technically true, but it has become common practice to call all delimited text files CSV. I'm not saying it's right or wrong, just that it's common.
– e4c5, Commented Jan 26, 2017 at 3:16
@e4c5 It has become common practice to call digits "numbers". That doesn't make them numbers, does it? It never hurts to go back to basics.
– DYZ, Commented Jan 26, 2017 at 3:21
@DYZ I am sorry for the confusion. I just omitted the commas, but did you get what I mean?
– Parker, Commented Jan 26, 2017 at 3:24
@DYZ Note that in pandas, the function to read files like the above is named read_csv.
– e4c5, Commented Jan 26, 2017 at 3:26

e4c5 · Accepted Answer · 2017-01-26 03:45:04Z

3

Use pandas; it's a one-liner:

import pandas as pd

# In pandas 2.2+, delim_whitespace is deprecated; sep=r'\s+' is the equivalent.
df = pd.read_csv('myfile.csv', delim_whitespace=True)
{ k: { 'max_points': df[k].max(), 'weight': df[k][5] } for k in df.keys()[1:] }

Edit: Oops, I see that the OP isn't actually looking for max():

{ k: { 'max_points': df[k][6], 'weight': df[k][5] } for k in df.keys()[1:] }

By the way, if pandas doesn't recognize your headers properly:

# header=0 discards the file's own header row so that `names` replaces it
df = pd.read_csv('/tmp/df.txt', delim_whitespace=True, header=0, names=['Student', 'Test 1', 'Test 2', 'Test 3', 'Final Exam'])

edited Jan 26, 2017 at 3:45

answered Jan 26, 2017 at 3:26


4 Comments

Parker Over a year ago

Thanks. But can I do this without using pandas?

e4c5 Over a year ago

of course you can, with the greatest difficulty :-)

Josh Smeaton Over a year ago

This is a cool solution, I need to familiarise myself with pandas more. I tried min() for the weight, but the values from the csv aren't great as binary numbers :)

e4c5 Over a year ago

@JoshSmeaton I am also still learning! Your answer is a pretty good one as well. +1

Josh Smeaton · Accepted Answer · 2017-01-26 03:37:33Z

2

Here's a solution not using pandas that should do what you want. Note though that my csv file is an actual csv file, so you may need to adjust the reader creation accordingly.

In [11]: import csv

In [12]: from collections import defaultdict

In [13]: reader = csv.DictReader(open('tests.csv'))

In [14]: record = defaultdict(dict)

In [15]: for row in reader:
    ...:    if row['Student'] == 'weight':
    ...:        for header in reader.fieldnames[1:]:
    ...:            record[header]['weight'] = row[header]
    ...:    if row['Student'] == 'max_points':
    ...:        for header in reader.fieldnames[1:]:
    ...:            record[header]['max_points'] = row[header]


In [17]: from pprint import pprint

In [18]: pprint(record)
defaultdict(<class 'dict'>,
            {'Final Exam': {'max_points': '30', 'weight': '0.4'},
             'Test 1': {'max_points': '12', 'weight': '0.15'},
             'Test 2': {'max_points': '20', 'weight': '0.25'},
             'Test 3': {'max_points': '9', 'weight': '0.2'}})
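
For a file laid out like the one in the question, where each comma is followed by padding spaces, the reader creation might be adjusted roughly as follows. This is only a sketch under assumptions: the data lives in an in-memory string (`SAMPLE` and `build_record` are made-up names), every row including `max_points` is comma-separated, and the label column is taken from the first field name instead of being hard-coded.

```python
import csv
import io
from collections import defaultdict

# Assumed sample data: same layout as the question, with commas on every row.
SAMPLE = """\
Student,    Test 1, Test 2, Test 3, Final Exam
A,          9,      19,     9,      22
B,          10,     16,     9,      26
weight,     0.15,   0.25,   0.2,    0.4
max_points, 12,     20,     9,      30
"""

def build_record(text):
    """Return {test name: {'weight': ..., 'max_points': ...}} from CSV text."""
    # skipinitialspace=True drops the padding that follows each comma.
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    label, *tests = reader.fieldnames  # first column holds the row labels
    record = defaultdict(dict)
    for row in reader:
        # Only the two summary rows contribute; student rows are skipped.
        if row[label] in ('weight', 'max_points'):
            for test in tests:
                record[test][row[label]] = row[test]
    return dict(record)

print(build_record(SAMPLE))
```

The values stay as strings, matching the output requested in the question; wrap `row[test]` in `float()` if numbers are needed.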

If you haven't seen defaultdict before: the callable you pass to the constructor (here, dict) is called to create the value whenever you access a key that doesn't yet exist.
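
A small illustration of that behaviour, using made-up keys, compared with what a plain dict does on the same access:

```python
from collections import defaultdict

record = defaultdict(dict)  # dict is the factory called for missing keys

# 'Test 1' does not exist yet, so defaultdict calls dict() to create an
# empty inner dictionary, and the assignment then fills it.
record['Test 1']['weight'] = '0.15'
print(record['Test 1'])

# Merely reading a missing key also creates it.
record['Test 2']
print(sorted(record))

# A plain dict raises KeyError on the same kind of access.
plain = {}
try:
    plain['Test 1']['weight'] = '0.15'
except KeyError as exc:
    print('KeyError:', exc)
```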

edited Jan 26, 2017 at 3:37

answered Jan 26, 2017 at 3:26


5 Comments

Parker Over a year ago

Thanks. But can I do this without accessing those keys? For example, without writing record['test 1']['weight'] = row['test 1']?

e4c5 Over a year ago

If you don't want to use pandas, you will have to accept this answer.

Parker Over a year ago

@JoshSmeaton Thanks. But it does not work for me. I got this result if I do not use pprint: defaultdict(<class 'dict'>, {})

Parker Over a year ago

@JoshSmeaton When I use pprint, I still get an empty dictionary. Do you know how to fix it?

Parker Over a year ago

I now get where the empty dict comes from. I used row.pop('Student') in previous lines. All fixed! Thanks so much!
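
One plausible way an earlier loop like the one in the question leaves `record` empty is by consuming the reader: a `csv.DictReader` can be iterated only once, so a second loop over the same reader sees no rows. The sketch below assumes that is what happened; the in-memory sample is made up for illustration.

```python
import csv
import io

SAMPLE = "Student,Test 1\nA,9\nweight,0.15\n"  # hypothetical data

reader = csv.DictReader(io.StringIO(SAMPLE))

# First pass: this loop reads every row, exhausting the reader.
first_pass = [row.pop('Student') for row in reader]

# Second pass over the same reader: nothing is left to iterate.
second_pass = [row for row in reader]

print(first_pass)
print(second_pass)

# A fresh reader (or re-opening the file) makes the rows available again.
fresh = csv.DictReader(io.StringIO(SAMPLE))
print([row['Student'] for row in fresh])
```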
