I'm parsing a CSV file using Python.

The CSV file looks like:

value1,value2,value3(a,b,c)

The Python code:

import csv

# file_path is the path to the CSV file shown above
with open(file_path, newline='') as this_file:
    reader = csv.reader(this_file, delimiter=',')
    for row in reader:
        print(row)

Obviously the CSV reader interprets this as:

"value1","value2","value3(","a","b","c)"

What is the best way to stop Python from breaking value3(a,b,c) into four values?

Thanks.

  • Create the CSV file correctly, either with the csv module or in Excel or OpenOffice. If you do this, the CSV writer will properly escape nested commas. Commented Jul 29, 2013 at 21:30
  • So how do you want it to be interpreted? As inconvenient as it may be, you can just write something yourself and use split(). That's what I'd do if there's nothing you can do about properly formatting the CSV file. Commented Jul 29, 2013 at 21:30
  • Your CSV file is badly malformed. Put quotes around value3 to make it a valid CSV value. Commented Jul 29, 2013 at 21:32
  • Joran - I don't have control over the CSV generation. @aleksander-lidtke - I'd like it as val1, val2, val3(a,b,c). I was hoping to avoid split. Thanks for the replies! Commented Jul 29, 2013 at 21:33
  • And remove the spaces after the commas. In short, that's NOT a CSV file at all. You may need to write your own parser. Commented Jul 29, 2013 at 21:35

1 Answer

Here's some code that handles the given example:

a = 'value1, value2, value3(a, b, c)'
parts = a.split(', ')
result = []
for i, ent in enumerate(parts):
    if '(' in ent:
        # First piece containing '(': glue it and every remaining piece back together.
        # May need a check whether ) has been reached yet, in which case stop adding items.
        result.append(','.join(parts[i:]))
        break
    # Ordinary entry before the parentheses: keep it as is.
    result.append(ent)
# result == ['value1', 'value2', 'value3(a,b,c)']

It will need an extra check if "normal" entries follow the one containing the parentheses (as indicated in the comment), e.g.

a='value1, value2, value3(a, b, c), value4'

Hope this helps. Apologies, I can't think of any way to use the built-in csv parser, since your file is not, in fact, a "proper" CSV...
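
The extra check mentioned above could be sketched by tracking parenthesis depth character by character, rather than splitting first and re-joining afterwards. This is only a minimal sketch, not the code from the answer: the function name split_outside_parens is hypothetical, and it assumes parentheses are balanced and that quoted fields do not need to be handled.

```python
def split_outside_parens(line, sep=','):
    """Split line on sep, ignoring separators inside parentheses."""
    fields = []
    current = []
    depth = 0  # number of currently open parentheses
    for ch in line:
        if ch == '(':
            depth += 1
        elif ch == ')' and depth > 0:
            depth -= 1
        if ch == sep and depth == 0:
            # Top-level separator: close the current field.
            fields.append(''.join(current).strip())
            current = []
        else:
            current.append(ch)
    # Whatever is left after the last separator is the final field.
    fields.append(''.join(current).strip())
    return fields


print(split_outside_parens('value1, value2, value3(a, b, c), value4'))
```

Because the separator is only honoured at depth zero, entries that follow the parenthesised one (such as value4) come out as separate fields, and whitespace inside the parentheses is left untouched.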

1 Comment

Thanks for your detailed reply. I ended up using something similar to this. Value3 was a known string, so I was able to check for that and then use list[:n] and list[n:] to obtain the desired output. Thanks again!
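
One possible reading of the approach described in this comment, sketched under assumptions: let csv.reader split the line as usual, find the position n of the field that starts with the known string, and re-join everything from n onwards. The names KNOWN_PREFIX and rejoin_known_field are hypothetical, and the sketch assumes the parenthesised field is the last one on the line.

```python
import csv
import io

KNOWN_PREFIX = 'value3('  # assumption: the field to repair always starts like this


def rejoin_known_field(row, prefix=KNOWN_PREFIX):
    """Merge the pieces of the field starting with prefix back together."""
    for n, field in enumerate(row):
        if field.startswith(prefix):
            # Keep row[:n] as is and glue row[n:] back into a single field.
            return row[:n] + [','.join(row[n:])]
    return row  # prefix not found: leave the row unchanged


# In-memory stand-in for the file from the question.
sample = io.StringIO('value1,value2,value3(a,b,c)\n')
for row in csv.reader(sample):
    print(rejoin_known_field(row))
```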
