I have a 27GB CSV file and I simply want to rename the columns in its header row. Can I do this without reading the entire file into a DataFrame and then writing the entire file out again?

This is essentially what I want to do, but without re-writing the whole 27GB file.

import pandas as pd

data = pd.read_csv(filename, sep="|", nrows=2)
data.head()

   LOC_ID           UPC      FW  BOP_U  BOP_$
0      17  438531560821  201712      1   40.0
1     239  438550152328  201719      2   28.8


data.columns = ['WHSE','SKU','PERIOD','QUANTITYONHAND','DOLLARSONHAND']
data.head()


   WHSE           SKU  PERIOD  QUANTITYONHAND  DOLLARSONHAND
0    17  438531560821  201712               1           40.0
1   239  438550152328  201719               2           28.8
  • Check here Commented Feb 17, 2017 at 15:21
  • So you want to change the header in the file, on the file system? Commented Feb 17, 2017 at 15:24
  • There are certainly easier ways to do this than Pandas or even Python. Commented Feb 17, 2017 at 15:25
  • It seems you have to rewrite the file - info Commented Feb 17, 2017 at 15:30
  • This is better suited to a command-line tool or shell script than to Python/pandas. Commented Feb 17, 2017 at 15:40

1 Answer

To look at the header without loading the whole file, limit the read to a single row with nrows.

header_df = pd.read_csv('my_file.csv', sep="|", index_col=0, nrows=1)

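As for rewriting the file, I don't think you can avoid processing the entire file. Unless the new header line is exactly as long in bytes as the old one, every byte after it has to shift, and ordinary file APIs offer no way to insert bytes at the start of a file in place.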
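
That said, rewriting does not have to mean parsing 27GB into a DataFrame. A minimal sketch, assuming the header is the first physical line of the file (no quoted line breaks inside it) and that there is enough free disk space for a second copy: write the new header to a temporary file, stream the rest of the bytes across unchanged, then swap the files. The function name replace_header and the 16 MiB chunk size are illustrative choices, not anything from pandas.

```python
import os
import shutil
import tempfile


def replace_header(path, new_columns, sep="|"):
    """Rewrite `path` with a new header line, copying the body byte for byte."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the original so the final
    # os.replace() stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as dst, open(path, "rb") as src:
            old_header = src.readline()  # read past the old header line
            # Keep whichever line ending the file already uses.
            newline = b"\r\n" if old_header.endswith(b"\r\n") else b"\n"
            dst.write(sep.join(new_columns).encode("utf-8") + newline)
            # Stream the remaining data in 16 MiB chunks without parsing it.
            shutil.copyfileobj(src, dst, 16 * 1024 * 1024)
        os.replace(tmp_path, path)  # swap the new file into place
    except BaseException:
        # Clean up the partial temporary file, then re-raise the error.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With the columns from the question, the call would be replace_header(filename, ['WHSE', 'SKU', 'PERIOD', 'QUANTITYONHAND', 'DOLLARSONHAND']). The whole file is still read and written once, but memory use stays at one chunk and no CSV parsing happens.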
