I need to parse a comma-separated file. My code works the way I want, but while testing it and deliberately trying to break it, I ran into a problem.

Here is the example code:

import csv

compareList = ["testfield1", "testfield2", "testfield3", "testfield4"]
z = open("testFile", 'r')
# Output handle; the original snippet never opened it, so the name "outFile" is assumed.
testFile = open("outFile", 'w')
x = csv.reader(z, quotechar='\'')
testDic = {}
for lineList in x:
    try:
        # Map each expected field name to the value at the same position.
        for idx, item in enumerate(compareList):
            testDic[item] = lineList[idx]
    except IndexError:
        # The line is too short: blank out every field.
        for item in compareList:
            testDic[item] = ""

    # Write the fields back out, separated by commas.
    for item in compareList:
        testFile.write(testDic[item])
        if compareList.index(item) != len(compareList) - 1:
            testFile.write(",")
    testFile.write('\n')
testFile.close()
z.close()

This is supposed to check that each line of the CSV file has as many fields as compareList. If it does not, the line is replaced with empty values (just commas) matching the length of compareList. Here is an example of what is in the file:

,,"sometext",343434
,,"moretext",343434
,,"stuff",4543343
,,"morestuff",3434354

The code works fine if a line is missing an item. The output for a file containing:

,"sometext",343434
,,"moretext",343434
,,"stuff",4543343
,,"morestuff",3434354

will look like this:

,,,,
,,"moretext",343434
,,"stuff",4543343
,,"morestuff",3434354

The problem I induced appears when a line looks like the third one here:

,"sometext",343434
,,"moretext",343434
,,"St,'",uff",4543343
,,"morestuff",3434354

The output for this file will be:

,,,,
,,"moretext",343434
,,,,

So it applies the change as expected and blanks out lines 1 and 3, but then it stops processing at line 3. I've been pulling my hair out trying to figure out what is going on, with no luck.

As always I greatly appreciate any help you are willing to give.

  • Your CSV is malformed in the "problem" file... Commented Feb 3, 2017 at 15:55

1 Answer

Print each line returned by csv.reader to see what the problem is:

>>> import csv
>>> z=open("testFile",'r')
>>> x=csv.reader(z,quotechar='\'')
>>> for lineList in x:
...     print lineList
...
['', '"sometext"', '343434']
['', '', '"moretext"', '343434']
['', '', '"St', '",uff",4543343\n,,"morestuff",3434354\n']

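csv.reader treats the last two lines as a single row: with quotechar='\'', the single quote after St, opens a quoted field that is never closed, so everything up to the end of the file lands in that field. Now remove quotechar='\'':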

>>> import csv
>>> z=open("testFile",'r')
>>> x=csv.reader(z)
>>> for lineList in x:
...     print lineList
...
['', 'sometext', '343434']
['', '', 'moretext', '343434']
['', '', "St,'", 'uff"', '4543343']
['', '', 'morestuff', '3434354']
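
The two sessions above can be reproduced side by side without a file on disk. This is a minimal sketch in Python 3 syntax (the sessions above are Python 2); the file contents are inlined through io.StringIO, and the names data, with_single and with_default are made up for the illustration.

```python
import csv
import io

# Contents of the "problem" file from the question, inlined for the demo.
data = (
    ',"sometext",343434\n'
    ',,"moretext",343434\n'
    ',,"St,\'",uff",4543343\n'
    ',,"morestuff",3434354\n'
)

# With a single quote as quotechar, the stray ' on the third line opens a
# quoted field that never closes, so the fourth line is swallowed into it.
with_single = list(csv.reader(io.StringIO(data), quotechar="'"))

# With the default quotechar ("), every physical line becomes its own row.
with_default = list(csv.reader(io.StringIO(data)))

print(len(with_single), "rows with quotechar=\"'\"")
print(len(with_default), "rows with the default quotechar")
print(with_single[-1])
```

The last row printed is the merged one, which is why the loop in the question appears to stop after the third line: the reader has no more rows to hand out.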

1 Comment

Thank you, sir. I get it. The problem occurs before I even start applying my code to the CSV. However, if I remove the quotechar, my files are not processed correctly. What would be a good way to handle malformed lines?
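
One possible way to handle malformed lines while keeping quotechar='\'' is to parse each physical line on its own, so an unclosed quote cannot swallow the lines after it. The sketch below is only that, a sketch in Python 3 syntax: parse_lines and EXPECTED_FIELDS are hypothetical names, it assumes that no legitimate field ever contains a line break, and it relies on the csv module's strict=True option to raise csv.Error for a quoted field that is never closed.

```python
import csv
import io

EXPECTED_FIELDS = 4  # assumed to equal len(compareList) from the question


def parse_lines(lines, expected=EXPECTED_FIELDS, quotechar="'"):
    """Parse every physical line separately and blank out malformed ones."""
    rows = []
    for line in lines:
        text = line.rstrip("\r\n")
        try:
            # strict=True turns an unterminated quoted field into csv.Error
            # instead of silently returning whatever was read so far.
            parsed = list(csv.reader([text], quotechar=quotechar, strict=True))
            fields = parsed[0] if parsed else []
        except csv.Error:
            fields = []
        # Wrong field count (or a parse error): replace with empty fields.
        if len(fields) != expected:
            fields = [""] * expected
        rows.append(fields)
    return rows


sample = (
    ',"sometext",343434\n'
    ',,"moretext",343434\n'
    ',,"St,\'",uff",4543343\n'
    ',,"morestuff",3434354\n'
)
rows = parse_lines(io.StringIO(sample))
for row in rows:
    print(",".join(row))
```

With the problem file from the question, lines 1 and 3 are blanked and line 4 is still processed. The trade-off is that a quoted field spanning several lines would also be treated as malformed, so this only fits data where each record sits on a single line.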
