2

I have an Excel (.xlsx) file with 40 or so sheets. Each sheet has the same structure, but contains different data. I want to extract information from each sheet and consolidate it into one, with the information from each sheet stacked one over the other. The information I need to extract from each are two:

  1. Sheet name, found always in cell E3
  2. Cell area of interest, found always between rows 72-85 and columns E-V

This extracted information would be pasted in rows 2-15 of the consolidated sheet with the sheet name in one column and all other information in the columns next to it. Then, information from the next sheet extracted would be pasted in rows 16-29 and so forth.

I’m new to Python and I am running Python 3.6.1. I have found solutions on how to combine Excel files using Python, but nothing for extracting pieces of information from specified cells in an Excel sheets.

Any guidance would be very helpful.

Update 1: I managed to load into a data frame the area of one of my sheets using.

import pandas as pd
cis = pd.ExcelFile(r"C:\mypath\myfile.xlsx") 
df1 = cis.parse('1. Consumers', skiprows=[0,1,2,3,4,5,6],parse_cols=[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]).drop([13])

However, I still have to add the sheet name to the data frame and then run this same loop across all of the sheets to then finally append the information from each sheet together in one dataframe/file.

9
  • 1
    Do all the sheets have same columns ? Commented Oct 1, 2017 at 22:10
  • @muazfaiz. Yes they do. Commented Oct 1, 2017 at 22:23
  • That would take 2 minutes to set up with Power Query, an add-in for 2010 and 2013 and built into 2016 as Get and Transform. Commented Oct 1, 2017 at 22:23
  • @teylyn Thanks for the tip. However, I was hoping to do this using a Python script, mainly because it will need to be used several times across different file sets. Commented Oct 1, 2017 at 22:26
  • Another reason to use Power Query. It can be set up once and then repeated many times, over different file sets. Commented Oct 1, 2017 at 22:30

2 Answers 2

3

This should be enough to get you started:

xls = pd.ExcelFile('excel_file_path.xls')
for sheet in xls.sheet_names:
    df = pd.read_excel(file_name, sheetname=sheet)

If you're already pretty familiar with excel, you'll pick up pandas quick. You should absolutely look into using jupyter notebook with python. Jupyter notebook will essentially create a spreadsheet like UI with tons of baked in functionality for running analysis and manipulating data.

References:

Use PANDAS to read multiple sheets from same workbook

Link to jupyter notebook documentation

Pandas tutorials

Practical business solutions with pandas for python

Sign up to request clarification or add additional context in comments.

4 Comments

Thanks for this. Another user mentioned 'openpyxl'. Is there a reason to prefer pandas over openpyxl? I can't figure out how to extract particular cells using pandas.
See my update. Does your code help me loop through files?
@StatsScared yes, if all sheets have the same structure, you should be able to handle all sheets with one simple loop. for selecting specific rows columns look into: pandas.pydata.org/pandas-docs/stable/indexing.html
please accept answer and if you have more questions for other parts of this process, check what's already been asked, usually its already on here, otherwise post a question specific to your new issue. there is a huge community of pandas users on here constantly answering questions. i rarely have the chance to get to a question before someone else does.
0

This would give you a single dataframe, and a 'Sheet' column that identifies which sheet it comes from:

def pd_read_excel_sheets_to_df(path):
    sheets_dictionary = pd.read_excel(path, sheet_name=None)
    for sheet_name, sheet in sheets_dictionary.items():
        sheet['Sheet'] = sheet_name
    return pd.concat(sheets_dictionary.values(), ignore_index=True)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.