I have quite a big CSV file where lines have varying length:

215080,49,3,0.0,22,42,0.0
215082,49,3,0.0,22,43,59.999 
215083,49,3,0.0,22,45,0.0
215085,49,3,0.0,22,46,59.999
215086,49,3,0.0,22,48,0.0
215087,49,3,0.0,22,49,0.001
215088,49,3,0.0,22,49,59.999
215089,49,3,0.0,22,51,0.0
215090,49,3,0.0,22,52,0.001
215688,49,1,59.999,22,49,0.001
215689,49,1,59.999,22,49,59.999
215690,49,1,59.999,22,51,0.0
215691,49,1,59.999,22,52,0.001
216291,49,1,0.001,22,51,0.0
216292,49,1,0.001,22,52,0.001
216293,49,1,0.001,22,52,59.999

I would like to replace, for example, only the fourth comma (,) in every line with a semicolon (;). How can I do this most efficiently?

  • The fourth comma only, or the fourth, eighth, 12th, and so on? Commented Dec 21, 2011 at 23:18
  • @Warren P Sorry if it was not clear: the fourth comma only. As you can see, the fourth comma is the separator between the coordinates, latitude and longitude. Commented Dec 23, 2011 at 20:02
  • English tip: "the fourth" is clear. Just the 4th. "Every fourth" means 4th, 8th, 12th, 16th, and so on. Commented Dec 23, 2011 at 20:12

3 Answers

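import csv

# Python 3: open in text mode with newline='' so csv handles line endings itself.
with open('source.csv', newline='') as source:
    rdr = csv.reader(source)
    with open('revised.csv', 'w', newline='') as target:
        wtr = csv.writer(target)
        for r in rdr:
            # Join fields 3 and 4 with ';' so only the fourth comma is replaced.
            wtr.writerow((r[0], r[1], r[2], '{0};{1}'.format(r[3], r[4]), r[5], r[6]))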

1 Comment

Thanks for your solution, although I had to change the last line to: wtr.writerow( (r[0], r[1], r[2], '{0};{1}'.format(r[3],r[4]), r[5], r[6]) )

You could do something like this on each line of input.

tmp = line.split(',', 4)
newline = '%s;%s' % (','.join(tmp[:4]), tmp[4])
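
Applied to a whole file, the same idea might look like the sketch below. The names `replace_nth_comma` and `convert_file` are placeholders introduced here for illustration, not part of the original answer, and the code assumes that no field contains a quoted comma.

```python
def replace_nth_comma(line, n=4, new=';'):
    """Replace only the n-th comma in line with new.

    Lines with fewer than n commas are returned unchanged.
    """
    parts = line.split(',', n)  # at most n splits, so up to n + 1 pieces
    if len(parts) <= n:
        return line  # fewer than n commas: nothing to replace
    return ','.join(parts[:n]) + new + parts[n]


def convert_file(src_path, dst_path, n=4):
    """Stream the file line by line so a big file is never held in memory."""
    with open(src_path) as src, open(dst_path, 'w') as dst:
        for line in src:
            # The trailing newline stays attached to the last piece.
            dst.write(replace_nth_comma(line, n))
```

Something like `convert_file('source.csv', 'revised.csv')` would then rewrite every line, reusing the file names from the csv-based answer.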

15 Comments

This would work unless it's a true CSV file with things like embedded quotes around fields that contain commas inside the literal string values. It works for the inputs that the OP specified, for instance.
@WarrenP: What do you mean by a "true" CSV file? I didn't think that there was a standard. Software that I've seen uses lots of different conventions for handling fields that contain commas in a "CSV".
@Charles: There may not be One True CSV, but Python's csv module is configurable enough to understand several variations, including Excel's, which I would think is the dominant variant of CSV in the real world.
+1 I wasn't aware that you could pass a second parameter to split.
This code is more readable and much faster than the csv version. If you know that the input file doesn't contain extra commas, you should go with the split version and not over-engineer things.
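
To make the caveat about quoted fields concrete, here is a small sketch. The sample row is made up for illustration and does not come from the question's data; it only shows how a plain split and the csv module disagree once a field contains a quoted comma.

```python
import csv
import io

# Hypothetical row: the second field contains a comma inside quotes.
row = '1,"Smith, John",3,0.0,22,42,0.0'

# A plain split treats the comma inside the quotes as a separator.
naive_fields = row.split(',')

# csv.reader honours the quotes and keeps the field in one piece.
csv_fields = next(csv.reader(io.StringIO(row)))

print(len(naive_fields), len(csv_fields))
```

For input like the question's, which has no quoted fields, both approaches see the same seven fields.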

Another approach

>>> a = '215080,49,3,0.0,22,42,0.0'
>>> b = a.split(',')
>>> ','.join(b[0:3] + [b[3] + ';'  + b[4]] + b[5:])
'215080,49,3,0.0;22,42,0.0'
