I'm trying to iterate over a list of rows in a table and modify a string in one of the columns:

# python 2.7
import csv
import re

with open('root_diff.txt', 'rU') as dmr:
    coordinates_tsv = csv.reader(dmr, delimiter='\t')
    coordinates_list = [row for row in coordinates_tsv]

    for row in coordinates_list:
        cut = re.split(':|-|r', row[3])
        print cut[1]

But I get the following error: IndexError: list index out of range

The string in row[3] looks something like this: chr1:594572-598657. I want to split it so it looks like this: ['ch', '1', '594572', '598657'], and do something with the second and third numbers.

  • You do not need to create coordinates_list; just loop directly over coordinates_tsv: for row in coordinates_tsv. Commented Oct 23, 2013 at 10:26
  • What line throws the exception? Can you include the full traceback please? Commented Oct 23, 2013 at 10:27
  • Thanks, seems obvious now! Although that wasn't the cause of the error I'm getting Commented Oct 23, 2013 at 10:28
  • That is why I posted that as a comment, not an answer. Without the traceback, I can only guess which one of the two lines is the problem here. Commented Oct 23, 2013 at 10:29
  • Do you have any blank lines in the file? Or a row that doesn't have a 4th column? If row[3] doesn't match your pattern, you won't get an error - I suspect this is your actual problem. Commented Oct 23, 2013 at 10:31

1 Answer

There must be at least one value for row[3] that doesn't contain any of the characters to split on.
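
To illustrate why that produces the error: when none of the delimiters occur in the string, re.split returns a one-element list, so index 1 does not exist. This is only a minimal sketch; the helper name split_coordinate and the header text 'location' are made up for the example and are not taken from your file:

```python
import re

def split_coordinate(value):
    # Same pattern as in the question: split on ':', '-' or 'r'.
    return re.split(':|-|r', value)

# A well-formed value splits into four pieces.
good = split_coordinate('chr1:594572-598657')

# Hypothetical header cell containing none of ':', '-' or 'r':
# the result has a single element, so indexing [1] raises IndexError.
bad = split_coordinate('location')
```

Here good has four elements while bad has only one, which is exactly the situation in which cut[1] fails.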

To debug, catch the IndexError and print cut and / or row[3] to see what is going on:

try:
    print cut[1]
except IndexError:
    print '-- unexpected input --', row[3]

If this is the header, skip it with next():

with open('root_diff.txt', 'rU') as dmr:
    coordinates_tsv = csv.reader(dmr, delimiter='\t')

    next(coordinates_tsv, None)  # skip first row, the header

    for row in coordinates_tsv:
        cut = re.split(':|-|r', row[3])
        print cut[1]

Note that in theory, it could also still be the previous line that is throwing this exception; you didn't share the traceback in your post. A blank line or a line with fewer columns would lead to an IndexError for row[3]. A blank line gives an empty list, for example.
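
If deleting the header by hand is not an option, both failure modes can also be guarded against in the loop itself. The following is a rough sketch rather than a drop-in fix: parse_rows and the sample data are hypothetical, and it assumes the values of interest are the start and end positions (the last two pieces of the split). It reads from a list of strings instead of a file so it can be run on its own:

```python
import csv
import re

def parse_rows(lines):
    """Yield (start, end) for each well-formed row; skip anything else."""
    for row in csv.reader(lines, delimiter='\t'):
        if len(row) < 4:
            # A blank line gives [], and a short line has no row[3].
            continue
        cut = re.split(':|-|r', row[3])
        if len(cut) < 4:
            # row[3] is not in the chrN:start-end form (a header cell, say).
            continue
        yield int(cut[2]), int(cut[3])

# Made-up sample input covering each case.
sample = [
    'a\tb\tc\tlocation\n',            # hypothetical header
    'x\ty\tz\tchr1:594572-598657\n',  # well-formed row
    '\n',                             # blank line
    'only\ttwo\n',                    # too few columns
]
coords = list(parse_rows(sample))
```

With a real file, the open file object can be passed in place of sample. Silently skipping rows is a design choice; if malformed rows should not occur at all, printing or logging them as in the try/except above is probably safer.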

1 Comment

There is a header, I deleted it & it seems to have fixed the problem. Thanks!
