I am looping through the worksheets of an Excel workbook and appending each one to a list as a DataFrame. When the loop finishes, I use pandas to concatenate the list into a single DataFrame. The problem I'm having is adding the worksheet name to the corresponding DataFrame.

import pandas as pd
import xlrd

# infile is a filepath variable
xls = xlrd.open_workbook(infile, on_demand=True)

dfList = []
for sheet_name in xls.sheet_names():
    df = pd.read_excel(infile, sheet_name, header=0)
    #df['Well_name'] = sheet_name
    dfList.append(df)
    print(sheet_name + " appended.")
    #time.sleep(2)
print("Loop complete")
# Concatenate the per-sheet DataFrames into one
dfs = pd.concat(dfList, axis=0)

I tried creating a new column in df, but that caused a length mismatch, and it also didn't work because the column kept being overwritten with the last worksheet name in the loop.

Any thoughts or suggestions?

1 Answer

It looks like you are running into a scoping issue. One way to avoid it is to use a list comprehension. Within the comprehension, pd.DataFrame.assign can add the sheet name as a new column to each DataFrame as it is read:

dfList = [pd.read_excel(infile, sheet_name, header=0).assign(Well_name=sheet_name)
          for sheet_name in xls.sheet_names()]

dfs = pd.concat(dfList, axis=0)
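
As an alternative sketch (not part of the original answer): pd.read_excel with sheet_name=None returns a dict that maps each sheet name to its DataFrame, and pd.concat accepts such a dict and uses its keys as an outer index level. The example below uses a small in-memory dict as a stand-in for that return value so it runs without an Excel file; the sheet names, the column names, and the "row" level label are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for pd.read_excel(infile, sheet_name=None),
# which returns a dict mapping each sheet name to its DataFrame.
sheets = {
    "Well_A": pd.DataFrame({"depth": [100, 200], "pressure": [1.5, 2.5]}),
    "Well_B": pd.DataFrame({"depth": [150], "pressure": [3.0]}),
}

# Passing a dict to pd.concat uses its keys as an outer index level.
# names= labels the index levels; reset_index(level=...) then moves the
# sheet-name level into an ordinary column, and the second reset_index
# replaces the leftover per-sheet row numbers with a fresh 0..n-1 index.
dfs = (
    pd.concat(sheets, names=["Well_name", "row"])
    .reset_index(level="Well_name")
    .reset_index(drop=True)
)

print(dfs)
```

This avoids opening the workbook separately with xlrd just to list the sheet names, at the cost of reading every sheet in the file.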

1 Comment

Thanks @jpp this did exactly what I wanted!
