18

I have a loop that reads the Excel sheets in a document. I want to store them all in a list:

  DF_list= list()

  for sheet in sheets:
     df= pd.read_excel(...)
     DF_list = DF_list.append(df)

If I type:

[df df df df]

it works.

Sorry, I have a MATLAB background and am not very used to Python, but I like it. Thanks.

2
  • 1
    DataFrame is an important class in pandas: pandas.pydata.org/pandas-docs/stable/generated/… . You should name your objects something else like df / mydf etc. Commented Jan 11, 2018 at 16:11
  • Before attempting to write any code please read the Python documentation. In your case, you should read the docs of list. Or just run help([].append) from the interpreter. Commented Jan 11, 2018 at 16:11

5 Answers

28

.append() modifies the list in place and returns None. You overwrite DF_list with None on the first iteration of the loop, so the append fails on the second iteration.

Therefore:

DF_list = list()

for sheet in sheets:
    DF_list.append(pd.read_excel(...))

Or use a list comprehension:

DF_list = [pd.read_excel(...) for sheet in sheets] 
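The failure described above can be reproduced without pandas. This is a minimal sketch: the strings are stand-ins for the DataFrames that pd.read_excel would return, and the names frames, broken and error_message are made up for the illustration.

```python
# Stand-in values; in the question these would be DataFrames from pd.read_excel.
frames = []
result = frames.append("first sheet")  # append mutates `frames` in place
print(result)  # the return value of append
print(frames)  # the list itself now holds the item

# Reassigning the name to the return value of append loses the list:
broken = []
broken = broken.append("first sheet")  # `broken` is now None, not a list

try:
    broken.append("second sheet")  # the second iteration of the original loop
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)
```

So the fix is simply to call DF_list.append(df) without assigning its result back to DF_list.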

1 Comment

To be fair, he also gives an explanation of why the code did not work, which is pretty useful for a beginner.
14

Try this:

DF_list = list()

for sheet in sheets:
    df = pd.read_excel(...)
    DF_list.append(df)

Or, for more compact Python, something like this would probably do:

DF_list = [pd.read_excel(...) for sheet in sheets]

4

If you pass the parameter sheet_name=None:

dfs = pd.read_excel(..., sheet_name=None)

it returns a dictionary of DataFrames, as the documentation describes:

sheet_name : string, int, mixed list of strings/ints, or None, default 0

    Strings are used for sheet names, Integers are used in zero-indexed
    sheet positions.

    Lists of strings/integers are used to request multiple sheets.

    Specify None to get all sheets.

    str|int -> DataFrame is returned.
    list|None -> Dict of DataFrames is returned, with keys representing
    sheets.

    Available Cases

    * Defaults to 0 -> 1st sheet as a DataFrame
    * 1 -> 2nd sheet as a DataFrame
    * "Sheet1" -> 1st sheet as a DataFrame
    * [0,1,"Sheet5"] -> 1st, 2nd & 5th sheet as a dictionary of DataFrames
    * None -> All sheets as a dictionary of DataFrames
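If a list is still wanted, as in the question, the dictionary's values can be collected into one. The following is only a sketch: read_all_sheets is a hypothetical helper name, and the small hand-built dictionary is an assumption that stands in for a real workbook so the snippet runs without a file.

```python
import pandas as pd


def read_all_sheets(path):
    """Hypothetical helper: return (sheet names, list of DataFrames) for a workbook."""
    # sheet_name=None asks read_excel for every sheet, keyed by sheet name.
    sheet_map = pd.read_excel(path, sheet_name=None)
    return list(sheet_map.keys()), list(sheet_map.values())


# Hand-built stand-in with the same shape as the dictionary that
# read_excel(..., sheet_name=None) returns: sheet name -> DataFrame.
sheet_map = {
    "Sheet1": pd.DataFrame({"a": [1, 2]}),
    "Sheet2": pd.DataFrame({"b": [3]}),
}

# Keep only the DataFrames, in the dictionary's order.
DF_list = list(sheet_map.values())
print(len(DF_list))
```

Keeping the dictionary is often more convenient than a list, because each DataFrame stays associated with its sheet name.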

3

A complete solution is as follows:

import pandas as pd

# (0) Path to the Excel file containing many sheets
#     (raw string, so backslashes such as \t and \f are not read as escapes)
excelfile_wt_many_sheets = r'C:\this\is\my\location\and\filename.xlsx'

# (1) Initiate empty list to hold all sheet-specific dataframes
df_list = []

# (2) Get the list of all sheet names in the Excel file
xls = pd.ExcelFile(excelfile_wt_many_sheets)
sheets = xls.sheet_names

# (3) Iterate over (2), read every sheet into a dataframe,
#     and append that dataframe to (1)
for sheet in sheets:
    df = pd.read_excel(excelfile_wt_many_sheets, sheet_name=sheet)
    df_list.append(df)

2

Actually, there's no need to define a new list to store a bunch of dataframes. pandas.ExcelFile, applied to an Excel file with multiple sheets, returns an ExcelFile object that gives access to every sheet in the workbook, and each sheet can be parsed into a dataframe on demand. Hope the code below helps.

import pandas as pd

# Raw string, so that \r in the path is not read as a carriage return
xls = pd.ExcelFile(r'C:\read_excel_file_with_multiple_sheets.xlsx')

sheet_names_list = xls.sheet_names

for sheet in sheet_names_list:
    df_to_print = xls.parse(sheet_name=sheet)
    print(df_to_print)
