2

I have some example text that I am working on in Python.

  Afghanistan:32738376
  Akrotiri:15700
  Albania:3619778
  Algeria:33769669
  American Samoa:57496
  Andorra:72413
  Angola:12531357
  Anguilla:14108
  Antigua and Barbuda:69842
  Argentina:40677348
  Armenia:2968586
  Aruba:101541
  Australia:20600856
  Austria:8205533
  Azerbaijan:8177717

I have this code to make a dictionary using the country names and population.

  dct = {}
  for line in infile:
      line = line.strip()
      words = line.split(":")
      countryname = words[0]

      population = int(words[1])
      dct[countryname] = population

When I print population, it prints all the values, but then I get an error on population = int(words[1]): IndexError: list index out of range. I don't understand how I am getting this error, especially since printing countryname works fine; the error only occurs with population. Python has to read the same number of lines for both variables, but with population it seems to be trying to access more lines, which I do not understand, because it doesn't do this for countryname. Any ideas on why this is occurring?

1
  • 4
    You probably have an empty line (or something similar) for which the split doesn't produce a second item. Try printing each line (or inspecting it with a debugger), then fix your code to handle the offending line. For example, you can check whether ':' is in the line before the split and skip the line if it is not. Commented Feb 6, 2016 at 11:10
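To illustrate the comment above, here is a minimal sketch of the "check for ':' and skip" idea. The function name parse_populations and the sample lines are made up for the example; the trailing "\n" entry assumes the file ends with a blank line, which may or may not be the real cause in your data.

```python
# Hypothetical sample data: the last entry mimics a blank line at the end of the file.
sample_lines = ["Afghanistan:32738376\n", "Akrotiri:15700\n", "\n"]


def parse_populations(lines):
    """Build {country: population}, skipping lines that have no ':' separator."""
    dct = {}
    for line in lines:
        line = line.strip()
        if ":" not in line:
            # "".split(":") returns [''], so words[0] exists but words[1] does not.
            # That is why countryname prints fine while population raises IndexError.
            continue
        countryname, population = line.split(":", 1)
        dct[countryname] = int(population)
    return dct


blank_words = "".split(":")
result = parse_populations(sample_lines)
print(blank_words)
print(result)
```

Skipping silently is a design choice; if you would rather know about bad lines, report them instead of using continue.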

4 Answers

1

You are assuming that every line in your file is well formed, and that assumption is wrong.

try:
    countryname = words[0]
    population = int(words[1])
    dct[countryname] = population
except IndexError:
    # The line has no ':' separator, so words has only one item.
    print("Could not convert line: %s" % line)

I would prefer logging over a print statement in this case, but for the sake of the example I think print is fine. You could also print the line number if you want.

In any case, the purpose of the try/except is to keep the program from crashing when the file doesn't follow the format you have in mind.
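A rough sketch of the same try/except using the logging module and line numbers might look like the following. The function name load_populations and the sample data are invented for the example, and catching ValueError as well (for a population that is not a number) is my own assumption rather than something the question's error requires.

```python
import logging

logging.basicConfig(level=logging.WARNING)
logger = logging.getLogger(__name__)


def load_populations(lines):
    """Parse 'country:population' lines, logging the ones that cannot be converted."""
    dct = {}
    # start=1 so that reported line numbers match what a text editor shows.
    for line_number, line in enumerate(lines, start=1):
        words = line.strip().split(":")
        try:
            dct[words[0]] = int(words[1])
        except (IndexError, ValueError):
            # IndexError: no ':' in the line. ValueError: population is not an integer.
            logger.warning("Could not convert line %d: %r", line_number, line)
    return dct


# Hypothetical input containing one blank line and one non-numeric population.
sample = ["Albania:3619778\n", "\n", "Algeria:abc\n", "Andorra:72413\n"]
populations = load_populations(sample)
```

With real data you would pass the open file object instead of the sample list, since iterating over a file also yields one line at a time.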

1

There might be lines without the ':' separator. Try checking for that:

dct = {}
for line in infile:
    line = line.strip()
    words = line.split(":")
    countryname = words[0]

    population = 0
    if words.__len__() > 1:
        population = int(words[1])

    dct[countryname] = population

1 Comment

Don't use words.__len__(); use len(words) instead.
0

I would recommend that you add the following diagnostic to your code:

dct = {}
# start=1 so the reported numbers match the line numbers in the file
for line_number, line in enumerate(infile, start=1):
    line = line.strip()
    words = line.split(":")

    if len(words) != 2:
        print("Line {} is not correctly formatted - {}".format(line_number, line))
    else:
        countryname = words[0]
        population = int(words[1])
        dct[countryname] = population

This then displays which lines in your data have formatting problems. It would show something like:

Line 123 is not correctly formatted - Germany8205534
Line 1234 is not correctly formatted - Hungary8205535

0

Please check your file content. It looks like a ':' is missing somewhere in the file between a country name and its population:

# 'rw' is not a valid mode; the default read mode is enough here.
with open('a.txt') as rfile:
    print(dict(line.strip().split(':') for line in rfile))
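Two things are worth knowing about the dict(...) one-liner above, shown here as a small sketch with made-up sample lists instead of a real file: the populations stay strings unless you convert them, and a line without ':' makes dict() raise ValueError, which is a quick way to confirm that the file has a malformed line.

```python
# Hypothetical sample data; the second list contains a line with no ':' separator.
sample_good = ["Aruba:101541\n", "Austria:8205533\n"]
sample_bad = ["Aruba:101541\n", "Austria8205533\n"]

# Every line splits into exactly two items, so dict() accepts the pairs.
# Note that the values are still strings, not ints.
as_strings = dict(line.strip().split(":") for line in sample_good)

try:
    dict(line.strip().split(":") for line in sample_bad)
    error_message = None
except ValueError as exc:
    # dict() needs every element to have exactly two items.
    error_message = str(exc)

print(as_strings)
print(error_message)
```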
