2

I am using pandas to read an Excel file with multiple sheets.

pd.read_excel('PATH', sheet_name=)

I only want to read the sheets whose names follow the pattern An nnnn, where each n is a digit. The file will also be updated in the future, so listing the sheet names one by one is not a good option.

Is it possible to read only the sheets whose names match this pattern, and if so, how?

2 Answers

6

You can first get a list of the Excel sheet names using the ExcelFile class (and its sheet_names attribute):

import pandas as pd

xl = pd.ExcelFile('foo.xlsx')
xl.sheet_names  # list of all sheet names in the workbook

Once you have that, you can select the sheets that match your pattern:

import re
import pandas as pd

dataframes = []
for sheet in xl.sheet_names:
    # raw string + fullmatch: the whole sheet name must be "An nnnn"
    if re.fullmatch(r'A\d \d{4}', sheet):
        # reuse the already-open ExcelFile instead of reopening foo.xlsx
        dataframes.append(pd.read_excel(xl, sheet_name=sheet))

You will then have all the dataframes in a list and can continue your code from there.
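If a dictionary keyed by sheet name is more convenient than a list, read_excel also accepts a list of names for sheet_name and then returns a dict of DataFrames. Below is a minimal sketch of that variant, not a tested drop-in. The helper names select_sheets and read_matching_sheets are hypothetical (they are not part of pandas), and the pattern assumes "An nnnn" means the letter A, one digit, a space, and four digits.

```python
import re

# "An nnnn": the letter A, one digit, a space, four digits (assumed reading of the pattern)
SHEET_PATTERN = re.compile(r"A\d \d{4}")


def select_sheets(sheet_names, pattern=SHEET_PATTERN):
    """Return the sheet names that match the pattern in full."""
    return [name for name in sheet_names if pattern.fullmatch(name)]


def read_matching_sheets(path, pattern=SHEET_PATTERN):
    """Read every matching sheet into a dict of {sheet name: DataFrame}."""
    import pandas as pd  # imported here so select_sheets can be used without pandas

    with pd.ExcelFile(path) as xl:
        wanted = select_sheets(xl.sheet_names, pattern)
        if not wanted:
            return {}  # nothing matched; avoid passing an empty list to read_excel
        # A list of names makes read_excel return a dict of DataFrames.
        return pd.read_excel(xl, sheet_name=wanted)
```

Calling read_matching_sheets('foo.xlsx') would then give one DataFrame per matching sheet. Sheets added to the workbook later are picked up automatically, as long as their names follow the pattern.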

1 Comment

Great! Exactly what I wanted. I just thought the pandas package had an option that would allow searching by pattern. A for loop usually takes more time than a built-in pandas function.

1

You can first search for all files that match a regex pattern, then load each file in with pandas.

from pathlib import Path
import re

import pandas as pd

directory = Path('your/directory/of/csvs/')

dataframes = []
for x in directory.iterdir():
    # match the start of the file name against "An nnnn"
    if re.match(r'A\d \d{4}', x.name):
        dataframes.append(pd.read_excel(x))  # load each matching file

Note: I haven't tested that regex.
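The pattern can be checked against a few sample names before it is pointed at real data. The sketch below uses four trailing digits, as the question's "An nnnn" describes. The candidate names are invented for illustration. It also shows the difference between re.match, which only anchors at the start of the string, and re.fullmatch, which requires the whole string to match.

```python
import re

pattern = re.compile(r"A\d \d{4}")

# Invented sample names, for illustration only
candidates = ["A1 2020", "A9 0001", "A1 202", "A1 20201", "Summary"]

# match() accepts any name that merely starts with the pattern
prefix_hits = [name for name in candidates if pattern.match(name)]

# fullmatch() accepts only names that are exactly "An nnnn"
full_hits = [name for name in candidates if pattern.fullmatch(name)]

print(prefix_hits)  # ['A1 2020', 'A9 0001', 'A1 20201']
print(full_hits)    # ['A1 2020', 'A9 0001']
```

The prefix behaviour of match is useful for file names, which carry an extension after the pattern. For sheet names, fullmatch is the stricter choice.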
