I am trying to combine multiple CSV files in a folder and then delete the first 3 columns. My code combines the files fine, but the columns are never dropped. Can anyone see the issue?

import pandas as pd
import glob

path = r'C:\Users\****' # use your path
all_files = glob.glob(path + "/*.csv")

li = []

for filename in all_files:
    df = pd.read_csv(filename, index_col=None, header=0)
    li.append(df)

frame = pd.concat(li, axis=0, ignore_index=True)

frame.drop(frame.columns[[0, 1, 2]], axis=1)

print(frame)
frame.to_csv('out.csv', index=False)
  • Can’t you just skip the first 3 columns at read time with usecols? For example, pd.read_csv('foo.csv', usecols=range(3, n)), where n is the total number of columns in the file. Commented Jun 20, 2021 at 11:55
  • You need to assign the result back: frame = frame.drop(frame.columns[[0, 1, 2]], axis=1) is one way. Commented Jun 20, 2021 at 12:09
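
To illustrate the second comment: drop returns a new DataFrame by default and leaves the original untouched, so the result has to be assigned. A minimal sketch, assuming a made-up five-column frame (the column names a to e are hypothetical and not from the question):

```python
import pandas as pd

# Hypothetical stand-in for the combined frame from the question.
frame = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6], "d": [7, 8], "e": [9, 10]})

# drop() returns a new DataFrame; without assignment the result is discarded.
frame.drop(frame.columns[[0, 1, 2]], axis=1)
columns_without_assignment = list(frame.columns)  # still all five columns

# Assigning the result keeps the change.
trimmed = frame.drop(frame.columns[[0, 1, 2]], axis=1)
columns_after_drop = list(trimmed.columns)

# Positional slicing with iloc selects the same columns.
same_as_iloc = trimmed.equals(frame.iloc[:, 3:])
```

Either form works; the key point is that the returned frame must be stored somewhere.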

1 Answer

You can use .iloc[:, 3:] to select every column from position 3 onward, which leaves out the first three:


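from pathlib import Path
import pandas as pd

# Raw string so the backslashes in the Windows path are not read as escapes
path = Path(r'C:\Users\****')  # use your path
all_files = path.glob("*.csv")

# Read each file, keep the columns from position 3 onward, then stack the rows
frame = pd.concat(
    (pd.read_csv(file_, index_col=None, header=0).iloc[:, 3:]
     for file_ in all_files),
    axis=0, ignore_index=True)

frame.to_csv('out.csv', index=False)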
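
To try this approach without touching real data, here is a small self-contained sketch. It assumes two throwaway files (part1.csv and part2.csv, both hypothetical) written to a temporary folder, and adds sorted() so the row order is predictable:

```python
import tempfile
from pathlib import Path

import pandas as pd

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    # Two hypothetical input files sharing the same five-column layout.
    (folder / "part1.csv").write_text("a,b,c,d,e\n1,2,3,4,5\n")
    (folder / "part2.csv").write_text("a,b,c,d,e\n6,7,8,9,10\n")

    # sorted() makes the concatenation order deterministic.
    frame = pd.concat(
        (pd.read_csv(f).iloc[:, 3:] for f in sorted(folder.glob("*.csv"))),
        axis=0,
        ignore_index=True,
    )

# Only the last two columns of each file survive.
result_columns = list(frame.columns)
result_rows = frame.values.tolist()
```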
3 Comments

Hmm. In pandas, it means all columns except the first 3. Columns can be referred to by position or by name.
I will find a PC. Mustafa, I think you are correct. I read the documentation, and it appears that it is not treated as an ordered list, since order does not matter.
Okay, change it to .iloc[:, 3:]. This reads all columns and then selects all but the first three. Everything else remains the same.
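
Since .iloc[:, 3:] still parses every column before discarding three, an alternative is to skip them at read time with usecols. A rough sketch, assuming the column count is not known in advance and is taken from the header first (csv_text and its column names are made up for illustration):

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for one of the files.
csv_text = "id,ts,src,x,y\n1,2,3,4,5\n6,7,8,9,10\n"

# Read only the header row to learn how many columns the file has.
n_cols = len(pd.read_csv(io.StringIO(csv_text), nrows=0).columns)

# Ask read_csv for every column position except the first three.
df = pd.read_csv(io.StringIO(csv_text), usecols=list(range(3, n_cols)))
kept_columns = list(df.columns)
```

This costs an extra header read per file, so it is probably only worth it when the skipped columns are large.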
