System info:
Python 2.7.2
Mac OS X 10.7.2

Problem (+background):
I have a large '.csv' file (~1 GB) that needs some minor editing. Every value in the 5th column must be 5 characters long; some are only 4 characters long and need a '0' placed in front of them. The code below reports no errors when run, but it stops writing with approximately 100 lines of the file left, which loses some crucial data. Does anyone know why this is happening?

I've re-created 'read_file.csv' and inspected it, but I don't see anything out of place. The code always stops at the same location, and I don't understand why.

import csv

path = '/Volumes/.../'

r = csv.reader(open(path + 'read_file.csv','rU'))
f =  open(path + 'write_file.csv', 'wb')

writer = csv.writer(f)

for line in r:

    if len(line[5]) == 4:
        line[5] = '0' + line[5]

    writer.writerow((line[0],line[1],line[2],line[3],line[4],line[5],line[6],line[7]))
  • Why are you opening the file in binary mode? Commented Nov 2, 2011 at 22:18
  • Can you give an example of what the file looks like? Commented Nov 2, 2011 at 22:19

3 Answers

Either close the output file after writing to it, or write the output inside a with block, which always closes the file even if an error occurs:

with open(path + 'write_file.csv', 'wb') as f:
    writer = csv.writer(f)
    for line in r:
        ...
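
For the first option, closing explicitly, here is a minimal sketch. It is written for Python 3 (text mode with newline=''); on Python 2.7, as in the question, you would open the files with 'rb' and 'wb' instead. The function name copy_csv and its parameters are made up for illustration.

```python
import csv

def copy_csv(src_path, dst_path):
    """Copy every row from src_path to dst_path and return the row count.

    Hypothetical helper: the try/finally blocks close both files even if
    an error is raised part-way through, so buffered rows reach the disk.
    """
    src = open(src_path, newline='')
    try:
        dst = open(dst_path, 'w', newline='')
        try:
            writer = csv.writer(dst)
            rows = 0
            for row in csv.reader(src):
                writer.writerow(row)
                rows += 1
        finally:
            dst.close()  # flushes the buffer and releases the file handle
    finally:
        src.close()
    return rows
```

Comparing the returned count with the number of records you expect in the input is a quick way to confirm nothing was dropped. The with form above does the same thing with less code.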

Things to check:

  • Are you examining the output only after your code has exited, so you know the file has been closed with .close() or at least flushed with .flush()?

  • Is it possible that something odd in your data on that line (an unbalanced quote, for example) makes the reader treat the rest of the file as data in a single field?

  • You're only saving a fixed number of columns from each line; you might try writer.writerow(line) instead.
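
One way to check the second point is to scan the input for records whose field count is off or whose fields contain line breaks, which is what you would see if a stray quote swallowed the rest of the file. A rough sketch, assuming Python 3; find_suspect_rows and expected_fields are hypothetical names, and the sample data is invented:

```python
import csv
import io

def find_suspect_rows(csv_file, expected_fields):
    """Return (record_number, field_count, has_embedded_newline) for odd records."""
    suspects = []
    for number, row in enumerate(csv.reader(csv_file), start=1):
        # A field containing a line break usually means a quote was never closed.
        has_newline = any('\n' in field or '\r' in field for field in row)
        if len(row) != expected_fields or has_newline:
            suspects.append((number, len(row), has_newline))
    return suspects

# Invented sample: the second line opens a quote that is never closed.
sample = io.StringIO('a,b,c\n1,"2,3\n4,5,6\n')
print(find_suspect_rows(sample, expected_fields=3))
```

If this reports a record near where the output stops, look at that line of the raw file in a text editor rather than a spreadsheet, since the quote characters are easy to miss otherwise.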

Ensure that the file is properly closed; the with statement makes this easy.

with open('test.csv', 'rU') as inp:
    csvin=csv.reader(inp)
    with open('output.csv', 'wb') as outp:
        csvout=csv.writer(outp)
        for line in csvin:
            csvout.writerow(line[:4] + [line[4].rjust(5, '0')] + line[5:])
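
To see what the padding expression does to a single row, here is a small sketch that isolates it. The helper name pad_column and its defaults (index 4, width 5) are assumptions chosen to match the "5th column, 5 characters" description in the question.

```python
def pad_column(row, index=4, width=5):
    """Return a copy of row with row[index] left-padded with '0' to width."""
    padded = list(row)  # copy so the caller's row is left untouched
    padded[index] = padded[index].rjust(width, '0')
    return padded

print(pad_column(['a', 'b', 'c', 'd', '1234', 'e']))
print(pad_column(['a', 'b', 'c', 'd', '12345', 'e']))
```

Note that rjust pads any value shorter than the width, not only 4-character ones, and leaves values that are already 5 or more characters unchanged.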
