I have the following code:

import glob
import pandas as pd
# Raw strings keep the backslashes literal ("\f" would otherwise be a form feed).
allFiles = glob.glob(r"C:\*.csv")
frame = pd.DataFrame()
list_ = []
for file_ in allFiles:
    print(file_)
    df = pd.read_csv(file_, index_col=None, header=0)
    list_.append(df)
    frame = pd.concat(list_, sort=False)
print(list_)
frame.to_csv(r"C:\f.csv")

This combines multiple CSV files into a single CSV file.

However, it also adds a row-number column.

Input:

a.csv

a   b   c   d
1   2   3   4

b.csv

a   b   c   d
551 55  55  55
551 55  55  55

result: f.csv

    a   b   c   d
0   1   2   3   4
0   551 55  55  55
1   551 55  55  55

How can I modify the code so that the row numbers are not written to the output file?

2 Answers

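Add index=False to the to_csv call: frame.to_csv(r"C:\f.csv", index=False). By default, to_csv writes the DataFrame's index as the first column, which is where the row numbers come from. (The r prefix keeps "\f" from being read as a form-feed escape.)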

See: pandas.DataFrame.to_csv
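
For reference, here is a minimal sketch of the whole flow with that change applied. The temporary directory and the two sample files are stand-ins I made up to mirror a.csv and b.csv from the question, and ignore_index=True is an optional extra (not required for the fix) that renumbers the combined rows:

```python
import glob
import os
import tempfile

import pandas as pd

# Hypothetical sample inputs in a temp directory, mirroring a.csv and b.csv.
workdir = tempfile.mkdtemp()
with open(os.path.join(workdir, "a.csv"), "w") as fh:
    fh.write("a,b,c,d\n1,2,3,4\n")
with open(os.path.join(workdir, "b.csv"), "w") as fh:
    fh.write("a,b,c,d\n551,55,55,55\n551,55,55,55\n")

# Read every input file, then concatenate once after the loop.
paths = sorted(glob.glob(os.path.join(workdir, "*.csv")))
frames = [pd.read_csv(path, index_col=None, header=0) for path in paths]
frame = pd.concat(frames, ignore_index=True, sort=False)

# index=False keeps the row labels out of the written file.
out_path = os.path.join(workdir, "f.csv")
frame.to_csv(out_path, index=False)

with open(out_path) as fh:
    combined_text = fh.read()
print(combined_text)
```

The written file should then start directly with the a,b,c,d header, with no leading column of row numbers.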

You don't need pandas for this simple task. pandas parses each file and converts the data to NumPy structures, which is unnecessary here. You can do it with plain text-file manipulation:

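import glob
# Raw strings keep the backslashes literal ("\f" would otherwise be a form feed).
allFiles = glob.glob(r"C:\*.csv")
first = True
with open(r"C:\f.csv", "w") as fw:
    for filename in allFiles:
        print(filename)
        with open(filename, "r") as f:
            if not first:
                f.readline()  # skip the header on every file after the first
            first = False
            fw.writelines(f)  # copy the remaining lines unchanged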
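
A slightly more defensive variant of the same idea, as a sketch only: merge_csv_text is a hypothetical helper name, and the temp directory and sample files are made up for the demo. It skips the output file if it happens to match the glob pattern (as C:\f.csv would on a second run) and adds a newline when an input file's last line lacks one. Like the version above, it assumes every file has the same columns in the same order, since it copies text rather than aligning columns by name the way pd.concat does.

```python
import glob
import os
import tempfile


def merge_csv_text(pattern, out_path):
    """Concatenate files matching pattern into out_path, keeping one header."""
    out_abs = os.path.abspath(out_path)
    # Exclude the output file itself in case it matches the pattern.
    inputs = [p for p in sorted(glob.glob(pattern))
              if os.path.abspath(p) != out_abs]
    with open(out_path, "w") as fw:
        for i, path in enumerate(inputs):
            with open(path, "r") as f:
                if i > 0:
                    f.readline()  # skip the header of every file but the first
                for line in f:
                    # Guard against a final line without a trailing newline.
                    fw.write(line if line.endswith("\n") else line + "\n")
    return inputs


# Hypothetical demo data standing in for the files under C:\.
workdir = tempfile.mkdtemp()
with open(os.path.join(workdir, "a.csv"), "w") as fh:
    fh.write("a,b,c,d\n1,2,3,4")  # deliberately no trailing newline
with open(os.path.join(workdir, "b.csv"), "w") as fh:
    fh.write("a,b,c,d\n551,55,55,55\n551,55,55,55\n")

out_path = os.path.join(workdir, "f.csv")
merged_inputs = merge_csv_text(os.path.join(workdir, "*.csv"), out_path)
with open(out_path) as fh:
    merged_text = fh.read()
print(merged_text)
```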

2 Comments

This code also adds the header from each file. The header should be shown only once. The fw.writelines(f) call needs to be conditional: write the row only if it's not the header row, except the first time.
I fixed it with a flag, @jack.
