2

I have a folder of Excel files, many of which have 3-4 tabs worth of data that I just want as individual Excel files. For example, let's say I have an Excel file with three tabs: "employees", "summary", and "data". I would want this to create 3 new Excel files out of this: employees.xlsx, summary.xlsx, and data.xlsx.

I have code that will loop through a folder and identify all of the tabs, but I have struggling to figure out how to export data individually from each sheet into its own Excel file. I have gotten to the point where I can loop through the folder, open each Excel file, and find the name of each sheet. Here's what I have so far.

import pandas as pd
import os

# filenames
files = os.listdir()    
excel_names = list(filter(lambda f: f.endswith('.xlsx'), files))

excels = [pd.ExcelFile(name, engine='openpyxl') for name in excel_names]
sh = [x.sheet_names for x in excels] # I am getting all of the sheet names here
for s in sh:
    for x in s:
        #there is where I want to start exporting each sheet as its own spreadsheet

#df.to_excel("output.xlsx", header=False, index=False) #I want to eventually export it obviously, this is a placeholder
2
  • Does this need to be done using Python? It could easily be done using VBA. Commented Mar 11, 2021 at 13:44
  • In this particular case it does need to be done in Python. VBA might be more practical, but it's a long story involving organizational rules that I will spare you. Commented Mar 11, 2021 at 13:48

1 Answer 1

2
import pandas as pd
import glob

# get the file names using glob 
# (this assumes that the files are in the current working directory)
excel_names = glob.glob('*.xlsx')
# iterate through the excel file names
for excel in excel_names:
    # read the excel file with sheet_name as none
    # this will create a dict
    dfs = pd.read_excel(excel, sheet_name=None)
    # iterate over the dict keys (which is the sheet name)
    for key in dfs.keys():
        # use f-strings (only available in python 3) to assign 
        # the new file name as the sheet_name
        dfs[key].to_excel(f'{key}.xlsx', index=False)
Sign up to request clarification or add additional context in comments.

4 Comments

Thank you, I'm getting a xlrd.biffh.XLRDError: Excel xlsx file; not supported error. Is there a way to do this using openpyxl instead of XLRD? I haven't worked with Glob much so I don't know if I can change this setting.
@tenebrissilentio what version of pandas are you using? print(pd.__version__) Is the error thrown on the pd.read_excel(...) portion of the code?
Apparently too old of a version. I updated Pandas and it worked. I figured during the course of 7 months, updating Pandas couldn't have been the issue, but boy was I wrong. That did it. Thanks so much.
@tenebrissilentio You're welcome and good luck.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.