
I have a DataFrame that I created by merging one column from each of 7 different Excel files. Below is the code I used:

import pandas as pd
import glob

my_excel_files = glob.glob(r"C:\Users\.........\*.xlsx")

total_dataframe = pd.DataFrame() 

for file in my_excel_files:
    new_df = pd.read_excel(file)['Comments']  # read the workbook and keep only its Comments column
    total_dataframe = pd.concat([total_dataframe, new_df], axis=1) # Puts together all Comments columns

As you can see, I grab the 'Comments' column from each Excel file and put them together into a new DataFrame. The only issue is that all of the columns are just called 'Comments' right now. I want to add the filename to each column name so I know which column comes from which Excel file. Ideally, one of the column headers would be 'Comments (first_response.xlsx)'.

1 Answer

Let's use pathlib and pd.concat.

Using a dict comprehension, we can take the .name attribute of each pathlib Path object as the key. When a dict is passed to concat, its keys (here, the filenames) become the outer level of the resulting index:

from pathlib import Path

dfs = pd.concat({f.name : pd.read_excel(f) for f in Path(r'C:\Users\..').glob('*.xlsx')})

This creates an index level holding the file name; you can call reset_index if you want it as a column instead.
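If the goal is the exact header format from the question, 'Comments (first_response.xlsx)', one possible variation is to build the label into the dict key and concatenate along axis=1. This is only a sketch and not the answerer's code: the helper name combine_comments, the file name second_response.xlsx, and the sample data are made up for illustration, and in-memory frames stand in for the real workbooks.

```python
import pandas as pd


def combine_comments(frames, column="Comments"):
    """Place one column from each frame side by side, labelled by file name.

    `frames` maps a file name to the DataFrame read from that file.
    (Hypothetical helper, not part of pandas.)
    """
    labelled = {
        f"{column} ({name})": df[column]  # e.g. 'Comments (first_response.xlsx)'
        for name, df in frames.items()
    }
    # With axis=1, the dict keys become the column names of the result.
    return pd.concat(labelled, axis=1)


# In-memory stand-ins for the workbooks (made-up data). With real files,
# the mapping could be built as:
#   frames = {f.name: pd.read_excel(f) for f in Path(folder).glob("*.xlsx")}
frames = {
    "first_response.xlsx": pd.DataFrame(
        {"Comments": ["good", "ok", "slow"], "Score": [5, 3, 2]}
    ),
    "second_response.xlsx": pd.DataFrame(
        {"Comments": ["fine", "great"], "Score": [4, 5]}
    ),
}

combined = combine_comments(frames)
print(combined.columns.tolist())
```

Because concat aligns the columns on the row index, a file with fewer rows is padded with NaN at the bottom of its column.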
