The file students.csv contains a list of students registered for a graduate course in chemistry. Write a function called createStudentDict() that opens this file and populates a dictionary with all of the students. The key should be the student ID, which is in the first column and should be stored as a string. The value should be a list in which the first item is the student's name (string), the second item is the student's age (integer), and the third item is the student's current full-time occupation (string).

Here are the contents of the file:

7373    Walter White    52  Teacher
8274    Skyler White    49  Author
9651    Jesse Pinkman   27  Student
2213    Saul Goodman    43  Lawyer
6666    Gus Fring   54  Chicken Guy
8787    Kim Wexler  36  Lawyer
9999    Tuco Salamanca  53  Drug Lord

I have tried writing the function and running it. I'm a programming beginner, so I'm not sure what to put here except that I've defined the function and written the try/except block, and the code is not running. I am not sure whether there are any errors other than the index error.

def createStudentDict():
  try:
    #Open the file    
    f=open("students.txt","r")
  except:
    #Print error message if file is not present
    print("File is not present")
  #Read the content of the file
  fileContent = f.read()
  #Splits the line by using the split method
  lines = fileContent.split("\n")
  #Create dictionary
  dict = {}
  #Iterate through all the line of the file

  for i in range(0,len(lines)):
    #Split line by using the comma as separator
    detailList = lines[i].split(',')
    #Create list with the student name, age and profession
    studentDetailList = [detailList[1], int(detailList[2]), detailList[3]]
    #Add or update the item in the dictionary
    dict.update({detailList[0]:studentDetailList})
  return dict
print(createStudentDict())

The exception is:

    Traceback (most recent call last):
      File "C:/Users/Owner/Documents/401 python/JONES ASSIGNMENT 3.py", line 47, in <module>
        print(createStudentDict())
      File "C:/Users/Owner/Documents/401 python/JONES ASSIGNMENT 3.py", line 37, in createStudentDict
        studentDetailList = [detailList[1], int(detailList[2]), detailList[3]]
    IndexError: list index out of range

This is the error I'm receiving. As for the expected output, invoking the function with print(createStudentDict()) should generate the following:

{'7373': ['Walter White', 52, 'Teacher'], '8274': ['Skyler White', 49, 'Author'], '9651': ['Jesse Pinkman', 27, 'Student'], '2213': ['Saul Goodman', 43, 'Lawyer'], '6666': ['Gus Fring', 54, 'Chicken Guy'], '8787': ['Kim Wexler', 36, 'Lawyer'], '9999': ['Tuco Salamanca', 53, 'Drug Lord']}
  • It seems like your file is not comma-separated, but you are splitting on commas. Since there are no commas, indexes 1, 2, and 3 will not exist. You might want to split by tab instead of comma. Commented Aug 14, 2019 at 13:52
  • @Nathan He has the dictionary key at index 0. His file is not comma-separated; it is tab-separated. I believe splitting by tab should fix the problem. Commented Aug 14, 2019 at 13:54
  • OK, I changed the range to (1, len(lines)) and split by '\t', and I am still getting the same error. Commented Aug 14, 2019 at 14:35

2 Answers

It appears that the CSV is separated by tabs rather than commas. Try this:

 detailList = lines[i].split('\t')

Since there are no commas, split(',') probably returns a list of length 1, which is why you get the index error. For future practice, you can print your variables, or better yet, use an IDE like PyCharm and its debugging mode.
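To make the length-1 list concrete, here is a small sketch. It assumes one record looks like the tab-separated strings the asker printed in the comments; the variable names are only for illustration.

```python
# One record, tab-separated as in the asker's printed list of lines.
line = "7373\tWalter White\t52\tTeacher"

by_comma = line.split(",")   # no commas in the line, so nothing is split
by_tab = line.split("\t")    # four separate fields

print(len(by_comma))  # 1
print(len(by_tab))    # 4

# A trailing empty line (left by the final newline in the file) also
# yields a single field, so indexing [1] on it raises IndexError too.
empty_fields = "".split("\t")
print(empty_fields)   # ['']
```

The last case may explain why the error can persist after switching to '\t': the list of lines ends with an empty string.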

EDIT: To comply with your given example I made the following modifications:

for i in range(0,len(lines)):
    # Set the intervals to hold the same number of spaces
    line = lines[i].replace("    ", "  ")
    # Now all the spaces are double white space, split by double white space
    detailList = line.split('  ')

and the output was

{'7373': ['Walter White', 52, 'Teacher'], '8274': ['Skyler White', 49, 'Author'], '9651': ['Jesse Pinkman', 27, 'Student'], '2213': ['Saul Goodman', 43, 'Lawyer'], '6666': ['Gus Fring', 54, 'Chicken Guy'], '8787': ['Kim Wexler', 36, 'Lawyer'], '9999': ['Tuco Salamanca', 53, 'Drug Lord']}

By the way, notice that you use a variable named dict, which is the name of a built-in type in Python (not a keyword, but still a name worth leaving alone). Shadowing it is bad practice that could lead to unexpected behavior; you could rename it to dict1 or result_dict.
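As a rough illustration of what can go wrong when the name dict is reused for a variable, consider the sketch below. The function name shadowing_demo is made up for this example.

```python
def shadowing_demo():
    dict = {}  # rebinds the name `dict` to an empty dictionary in this scope
    try:
        # The built-in constructor is no longer reachable under this name,
        # so this call targets the dictionary instance instead.
        dict(a=1)
    except TypeError as exc:
        return str(exc)
    return None

print(shadowing_demo())  # 'dict' object is not callable
```

The original function never calls dict() after the assignment, so it happens to work, but the problem would surface as soon as such a call was added.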

6 Comments

So I could attempt to debug this in IDLE?
Yes, here is a nice little guide for debugging using IDLE, although I personally prefer PyCharm, and its Community Edition is free of charge.
Btw, it's hard to tell whether this is bad formatting in the CSV or just how it came out when you copied it here, but it seems to use 4 spaces (not a tab) between all columns except the last, where it uses 2. Could you open the CSV and check? I'll edit my answer to work with your given example.
It is a tab for all of them; that is just how it came out here. I can make it separated by commas instead of tabs if that makes things easier.
['7373\tWalter White\t52\tTeacher', '8274\tSkyler White\t49\tAuthor', '9651\tJesse Pinkman\t27\tStudent', '2213\tSaul Goodman\t43\tLawyer', '6666\tGus Fring\t54\tChicken Guy', '8787\tKim Wexler\t36\tLawyer', '9999\tTuco Salamanca\t53\tDrug Lord', ''] {'7373': ['Walter White', 52, 'Teacher'], '8274': ['Skyler White', 49, 'Author'], '9651': ['Jesse Pinkman', 27, 'Student'], '2213': ['Saul Goodman', 43, 'Lawyer'], '6666': ['Gus Fring', 54, 'Chicken Guy'], '8787': ['Kim Wexler', 36, 'Lawyer'], '9999': ['Tuco Salamanca', 53, 'Drug Lord']} >>> Why am I getting this twice?
There are probably badly formatted lines, or maybe the CSV header itself, breaking your loop. Try putting everything inside the for loop in a try/except clause, and inside the except print the line so you know which one it is. If it is the header, you can use range(1, len(lines)) to skip the first line. Also, check out Python's built-in csv module. If the file is badly formatted in general, you can manually process the first line to see the output of detailList = lines[i].split(','), and change your format or your code accordingly.
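A minimal sketch of that approach with the csv module follows. It assumes the file is tab-separated, as the asker reports in the comments, and that every valid row has exactly four columns. The path parameter and its default are additions for illustration and are not part of the assignment.

```python
import csv

def createStudentDict(path="students.csv"):
    """Build {student_id: [name, age, occupation]} from a tab-separated file."""
    students = {}
    with open(path, newline="") as f:
        for row in csv.reader(f, delimiter="\t"):
            if len(row) != 4:
                # Blank lines come back as an empty list; anything else
                # with the wrong column count is reported so it can be found.
                if row:
                    print("Skipping malformed line:", row)
                continue
            student_id, name, age, occupation = row
            students[student_id] = [name, int(age), occupation]
    return students
```

Skipping rows that do not have four fields plays the role of the try/except: a trailing blank line no longer raises IndexError, and any genuinely malformed line is printed instead of crashing the loop.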

4 Comments

I changed the range to (1, len(lines)) and changed the split to the '\t' as stated above and I am still getting the index out of range error.
You need to print out lines to see what they look like.
can you rephrase that please? I am positive I am not understanding you correctly. Can you go further on what would be in the try/except block? Are you talking about putting the entire for loop in a try block and then except print(lines)?
['7373\tWalter White\t52\tTeacher', '8274\tSkyler White\t49\tAuthor', '9651\tJesse Pinkman\t27\tStudent', '2213\tSaul Goodman\t43\tLawyer', '6666\tGus Fring\t54\tChicken Guy', '8787\tKim Wexler\t36\tLawyer', '9999\tTuco Salamanca\t53\tDrug Lord', ''] {'7373': ['Walter White', 52, 'Teacher'], '8274': ['Skyler White', 49, 'Author'], '9651': ['Jesse Pinkman', 27, 'Student'], '2213': ['Saul Goodman', 43, 'Lawyer'], '6666': ['Gus Fring', 54, 'Chicken Guy'], '8787': ['Kim Wexler', 36, 'Lawyer'], '9999': ['Tuco Salamanca', 53, 'Drug Lord']} >>> Hey, why is my work giving me this twice now?
