I am trying to make a simple script that concatenates or appends multiple column sets that I pull from xls files within a directory. Each xls file has a format of:

Index    Exp. m/z   Intensity   
1        1000.11    1000
2        2000.14    2000
3        3000.15    3000

Each file has a varying number of rows. Below is my code:

import pandas as pd
import os
import tkinter.filedialog

full_path = tkinter.filedialog.askdirectory(initialdir='.')
os.chdir(full_path)

data = {}
df = pd.DataFrame()

for files in os.listdir(full_path):
    if os.path.isfile(os.path.join(full_path, files)):
        df = pd.read_excel(files, 'Sheet1')[['Exp. m/z', 'Intensity']]
        data = df.concat(df, axis=1)

data.to_excel('test.xls', index=False)

This raises AttributeError: 'DataFrame' object has no attribute 'concat'. I also tried using append like:

data = df.append(df, axis=1) 

but I know that append has no axis keyword argument. df.append(df) does work, but it stacks the new rows below the existing ones instead of adding columns. I want something like:

Exp. m/z   Intensity       Exp. m/z   Intensity  
1000.11    1000            1001.43    1000
2000.14    2000            1011.45    2000
3000.15    3000

and so on. The column sets that I pull from each file should be placed to the right of the previous column sets, with an empty column in between.

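It is only a typo: concat is a top-level pandas function, so call pd.concat(...) rather than df.concat(...). Note that pd.concat expects a list of DataFrames as its first argument. Commented Jun 27, 2017 at 7:47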

1 Answer

I think you need to append each DataFrame to a list and then call pd.concat once on that list:

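import numpy as np

dfs = []
for name in os.listdir(full_path):
    path = os.path.join(full_path, name)
    if os.path.isfile(path):
        # keep only the two columns of interest from each workbook
        df = pd.read_excel(path, 'Sheet1')[['Exp. m/z', 'Intensity']]
        # add an empty spacer column after each column set
        df['empty'] = np.nan
        dfs.append(df)
# place the collected column sets side by side
data = pd.concat(dfs, axis=1)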
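
To see what pd.concat with axis=1 does when the inputs have different lengths, here is a minimal sketch that needs no files. The frames `first` and `second` are hypothetical stand-ins for the column sets read from two workbooks, using the values from the question.

```python
import pandas as pd

# Hypothetical stand-ins for the column sets read from two xls files.
first = pd.DataFrame({'Exp. m/z': [1000.11, 2000.14, 3000.15],
                      'Intensity': [1000, 2000, 3000]})
second = pd.DataFrame({'Exp. m/z': [1001.43, 1011.45],
                       'Intensity': [1000, 2000]})

# axis=1 aligns the frames on their row index and puts them side by side;
# rows missing from the shorter frame are filled with NaN.
data = pd.concat([first, second], axis=1)

print(data)
print(data.shape)  # (3, 4)
```

Because the shorter frame is padded with NaN, its Intensity column is converted from integers to floats in the result.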

3 Comments

Thank you! I thought I got used to pandas with appending/concatenating, but never thought about doing it this way.
A quick question: is there a way to add an empty column between the files when concatenating?
I think the simplest way is to add it inside the loop before dfs.append(df); I have edited the answer.
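
The spacer-column idea from the comments can be sketched on in-memory data as follows. This is one possible variation, not the answer's exact code: the blank column label '' and the removal of the trailing spacer are assumptions made for illustration, and `frames` is a hypothetical stand-in for the per-file column sets.

```python
import numpy as np
import pandas as pd

# Hypothetical column sets standing in for two files.
frames = [
    pd.DataFrame({'Exp. m/z': [1000.11, 2000.14], 'Intensity': [1000, 2000]}),
    pd.DataFrame({'Exp. m/z': [1001.43], 'Intensity': [1000]}),
]

spaced = []
for frame in frames:
    frame = frame.copy()   # leave the input frame untouched
    frame[''] = np.nan     # spacer column with a blank label (assumption)
    spaced.append(frame)

# Concatenate side by side, then drop the spacer that follows the last set.
result = pd.concat(spaced, axis=1).iloc[:, :-1]

print(list(result.columns))
```

Dropping the last column with iloc avoids a dangling empty column at the right edge; keeping it, as the answer does, is equally valid if the extra column does not matter.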
