How to replace a column using Python's built-in .csv writer module?

Question

I need to do a find and replace (specific to one column of URLs) in a huge Excel .csv file. Since I'm in the beginning stages of trying to teach myself a scripting language, I figured I'd try to implement the solution in python.

I'm having trouble with the "replace" part of the solution. I've read the official csv module documentation about how to use the writer, but there isn't really a clear enough example for me (yes, I'm slow). So, now for the question: how does one iterate through the rows of a csv file with a writer object?

p.s. apologies in advance for the clumsy code, I'm still learning :)

import csv

csvfile = open("PALTemplateData.csv")
csvout = open("PALTemplateDataOUT.csv")
dialect = csv.Sniffer().sniff(csvfile.read(1024))
csvfile.seek(0)
reader = csv.reader(csvfile, dialect)
writer = csv.writer(csvout, dialect)

total=0;
needchange=0;
changed = 0;
temp = ''
changeList = []

for row in reader:
    total=total+1
    temp = row[len(row)-1]
    if '/?' in temp:
        needchange=needchange+1;
        changeList.append(row.index)

for row in writer:           #this doesn't compile, hence the question
    if row.index in changeList:
        changed=changed+1
        temp = row[len(row)-1]
        temp.replace('/?', '?')
        row[len(row)-1] = temp
        writer.writerow(row)

print('Total URLs:', total)
print('Total URLs to change:', needchange)
print('Total URLs changed:', changed)

no, it's the exact same file as the input file (has all of the same data) I just didn't want to accidentally overwrite anything I needed — ignorantslut
– ignorantslut, Commented Jun 19, 2009 at 17:54
What does "accidentally overwrite" mean? Usually we read one file and write a different file. — S.Lott
– S.Lott, Commented Jun 19, 2009 at 18:19
I guess you're right - it is pretty silly (not to mention bad "practice") to have that many copies. — ignorantslut
– ignorantslut, Commented Jun 19, 2009 at 18:23

David Z · Accepted Answer · 2009-06-19 18:01:33Z

6

The reason you're getting an error is that the writer doesn't have data to iterate over. You're supposed to give it the data - presumably, you'd have some sort of list or generator that produces the rows to write out.

I'd suggest just combining the two loops, like so:

for row in reader:
    row[-1] = row[-1].replace('/?', '?')
    writer.writerow(row)

And with that, you don't even need total, needchange, and changeList. (There are a bunch of optimizations in there that I unfortunately don't have time to explain, but I'll see if I can edit that info in later)

answered Jun 19, 2009 at 18:01

David Z

133k29 gold badges264 silver badges284 bronze badges

Sign up to request clarification or add additional context in comments.

6 Comments

Sean Cavanagh Over a year ago

Keep in mind that doing this will overwrite your output file as you go. This is the typical way of doing this kind of thing though. Starting with two copies of the file, like you have above, isn't the best practice.

S.Lott Over a year ago

+1: For that matter, you generally don't need the sniffer or the seek(0), either.

ignorantslut Over a year ago

That certainly is an elegant solution, however, I don't think you can use the same 'row' reference in both the for loop iteration and as the parameter of writerow(). The interpreter says: "io.UnsupportedOperation: BufferedReader.write() not supported"

S.Lott Over a year ago

@ignorantslut: write() certainly is not supported. writerow(), however, should be supported. Check your code to be sure you spelled the method name correctly.

ignorantslut Over a year ago

@S. Lott: I copied and pasted directly from David's example. Full traceback: File "C:\Documents and Settings\g41092\My Documents\palScript.py", line 21, in <module> writer.writerow(row) File "C:\Program Files\Python\lib\io.py", line 1495, in write self.buffer.write(b) File "C:\Program Files\Python\lib\io.py", line 701, in write self._unsupported("write") File "C:\Program Files\Python\lib\io.py", line 322, in unsupported (self.__class_.__name__, name)) io.UnsupportedOperation: BufferedReader.write() not supported

|

knabar · Accepted Answer · 2009-06-19 18:02:16Z

1

You should only have one loop and read and write at the same time - if your replacements only affect one line at a time, you don't need to loop over the data twice.

for row in reader:
  total=total+1
  temp = row[len(row)-1]
  if '/?' in temp:
    temp = row[len(row)-1]
    temp.replace('/?', '?')
    row[len(row)-1] = temp
  writer.writerow(row)

This is just to illustrate the loop, not sure if the replacement code will work like this.

answered Jun 19, 2009 at 18:02

knabar

1,1561 gold badge8 silver badges12 bronze badges

Comments

Greg · Accepted Answer · 2009-07-21 15:10:01Z

0

Once you have your csv in a big list, one easy way to replace a column in a list would be to transpose your matrix, replace the row, and then transpose it back:

mydata = [[1, 'a', 10], [2, 'b', 20], [3, 'c', 30]]

def transpose(matrix):
    return [[matrix[x][y] for x in range(len(matrix))] for y in range(len(matrix[0]))]

transposedData = transpose(mydata)
print transposedData
>>> [[1, 2, 3], ['a', 'b', 'c'], [10, 20, 30]]

editedData = transposedData[:2] + [50,70,90]
print editedData
>>> [[1, 2, 3], ['a', 'b', 'c'], [50, 70, 90]]

mydata = transpose(editedData)
print mydata
>>> [[1, 'a', 50], [2, 'b', 70], [3, 'c', 90]]

answered Jul 21, 2009 at 15:10

Greg

2,6254 gold badges23 silver badges28 bronze badges

Collectives™ on Stack Overflow

How to replace a column using Python's built-in .csv writer module?

3 Answers 3

6 Comments

Comments

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

6 Comments

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related