0

I am struggling to build a dictionary from a CSV file.

The CSV file is formatted like this:

student,    Test 1, Test 2, Test 3, Final Exam
A,          9,      19,    9,       22
B,          10,     16,    9,       26
C,          11,     17,    8,       27
D,          7,      14,    9,       18
E,          8,      20,    8,       23
weight,     0.15,   0.25,  0.2,     0.4
max_points  12      20     9        30

Rows 2-6 contain each student's name and their score on each test. The last two rows give the weight and the maximum score of each test, respectively.

Now, I want to create a dictionary from this file that looks like:

{'Test 1': {'weight': '0.15', 'max_points': '12'}, 
'Test 2': {'weight': '0.25', 'max_points': '20'}, 
'Test 3': {'weight': '0.2',   'max_points': '9'}, 
'Final Exam': {'weight': '0.4', 'max_points': '30'}}

The outer keys are the column names from the first row of the CSV file, excluding the student column. In each nested dictionary, the keys are the first-column labels of the last two rows, weight and max_points, and the values are the corresponding entries from those rows.

The only thing I have come up with so far is:

reader = csv.DictReader(open('gradebook.csv'))
for row in reader:
    key = row.pop('Student')

I have no idea how to proceed. Thank you so much for your help!

14
  • Your file, as shown, is not a CSV file. Columns in a CSV file are separated by commas. Commented Jan 26, 2017 at 3:15
  • @DYZ technically true, but it has become common practice to call all delimited text files CSV. Not saying it's right or wrong, just saying it's common. Commented Jan 26, 2017 at 3:16
  • @e4c5 It has become common practice to call digits "numbers". That doesn't make them numbers, does it? It never hurts to go back to basics. Commented Jan 26, 2017 at 3:21
  • @DYZ I am sorry for the confusion. I just omitted commas, but did you get what I mean? Commented Jan 26, 2017 at 3:24
  • @dyz note that in pandas, the function to read files like the above is named read_csv Commented Jan 26, 2017 at 3:26

2 Answers

3

Use pandas; it's a one-liner:

import pandas as pd

df = pd.read_csv('myfile.csv', delim_whitespace=True)
{ k: { 'max_points': df[k].max(), 'weight': df[k][5] } for k in df.keys()[1:] }

Edit: Oops, I see that the OP isn't actually looking for max(), so read the max_points row directly:

{ k: { 'max_points': df[k][6], 'weight': df[k][5] } for k in df.keys()[1:] }

By the way, if pandas doesn't recognize your headers properly, you can pass them explicitly:

# header=0 replaces the file's own header row with the names given here
df = pd.read_csv('/tmp/df.txt', delim_whitespace=True, header=0, names=['Student', 'Test 1', 'Test 2', 'Test 3', 'Final Exam'])

4 Comments

Thanks. But can I do this without using pandas?
Of course you can, with the greatest difficulty :-)
This is a cool solution, I need to familiarise myself with pandas more. I tried min() for the weight, but the values from the csv aren't great as binary numbers :)
@JoshSmeaton I am also still learning! Your answer is a pretty good one as well. +1
2

Here's a solution not using pandas that should do what you want. Note though that my csv file is an actual csv file, so you may need to adjust the reader creation accordingly.

In [11]: import csv

In [12]: from collections import defaultdict

In [13]: reader = csv.DictReader(open('tests.csv'))

In [14]: record = defaultdict(dict)

In [15]: for row in reader:
    ...:    if row['Student'] == 'weight':
    ...:        for header in reader.fieldnames[1:]:
    ...:            record[header]['weight'] = row[header]
    ...:    if row['Student'] == 'max_points':
    ...:        for header in reader.fieldnames[1:]:
    ...:            record[header]['max_points'] = row[header]


In [17]: from pprint import pprint

In [18]: pprint(record)
defaultdict(<class 'dict'>,
            {'Final Exam': {'max_points': '30', 'weight': '0.4'},
             'Test 1': {'max_points': '12', 'weight': '0.15'},
             'Test 2': {'max_points': '20', 'weight': '0.25'},
             'Test 3': {'max_points': '9', 'weight': '0.2'}})
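
On adjusting the reader creation: here is a minimal sketch, assuming the file is comma-separated with padding spaces after each comma, as in the question. The in-memory SAMPLE text and the build_record helper are made up for illustration, the header is spelled 'Student' to match the code above, and the max_points row is assumed to have commas. Passing skipinitialspace=True tells the csv module to ignore the spaces that follow each delimiter.

```python
import csv
import io
from collections import defaultdict

# Assumed sample data: comma-separated, padded with spaces after each comma.
SAMPLE = """\
Student,    Test 1, Test 2, Test 3, Final Exam
A,          9,      19,     9,      22
weight,     0.15,   0.25,   0.2,    0.4
max_points, 12,     20,     9,      30
"""

def build_record(fileobj):
    """Collect the 'weight' and 'max_points' rows into a nested dict."""
    # skipinitialspace=True drops the padding that follows each comma,
    # so headers and values come back without leading spaces.
    reader = csv.DictReader(fileobj, skipinitialspace=True)
    record = defaultdict(dict)
    for row in reader:
        label = row['Student']
        if label in ('weight', 'max_points'):
            for header in reader.fieldnames[1:]:
                record[header][label] = row[header]
    return dict(record)  # plain dict, so printing hides the defaultdict wrapper

record = build_record(io.StringIO(SAMPLE))
print(record['Test 1'])
```

With a real file you would pass open('tests.csv', newline='') instead of the StringIO object. The values stay as strings, as in the output above.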

If you haven't seen defaultdict before: the callable you pass to its constructor (here, dict) is called to create the value whenever you access a key that doesn't yet exist.
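
A small illustration of that defaultdict behaviour, separate from the CSV code (the variable names here are only for the example):

```python
from collections import defaultdict

# With dict as the factory, a missing key gets a fresh empty dict,
# so nested assignment works without creating the inner dict first.
record = defaultdict(dict)
record['Test 1']['weight'] = '0.15'

# With int as the factory, a missing key starts at 0.
counts = defaultdict(int)
counts['A'] += 1

# A plain dict raises KeyError for the same nested assignment.
plain = {}
try:
    plain['Test 1']['weight'] = '0.15'
    raised = False
except KeyError:
    raised = True

print(dict(record), dict(counts), raised)
```

Note that merely reading a missing key, for example record['Test 9'], also inserts it, which can be surprising when you later iterate over the dictionary.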

5 Comments

Thanks. But can I do this without accessing those keys explicitly? For example, without writing record['test 1']['weight'] = row['test 1']?
If you don't want to use pandas you will have to accept this answer
@JoshSmeaton Thanks. But it does not work for me. I got this result if I do not use pprint: defaultdict(<class 'dict'>, {})
@JoshSmeaton When I use pprint, I still get an empty dictionary. Do you know how to fix it?
I now get where the empty dict comes from. I used row.pop('Student') in previous lines. All fixed! Thanks so much!
