9

Per https://github.com/pandas-dev/pandas/pull/21251/files/09e5b456e1af5cde55f18f903ab90c761643b05a, we should be able to append DataFrames to new XLSX sheets.

Based on the documentation, I tried the following:

>>> import pandas as pd
>>>                
... d1 = pd.DataFrame({"A":['Bob','Joe', 'Mark'], 
...                "B":['5', '10', '20']})
>>> d2 = pd.DataFrame({"A":['Jeffrey','Ann', 'Sue'], 
...                "B":['1', '2', '3']})
>>> 
>>> # Create XLSX document for ticker
... writer = pd.ExcelWriter('test.xlsx',engine='openpyxl')
>>> d1.to_excel(writer,sheet_name='d1')
>>> writer.save()
>>> 
>>> writer = pd.ExcelWriter('test.xlsx',engine='openpyxl', mode='a')
>>> d2.to_excel(writer,sheet_name='d2')
>>> writer.save()
>>> 
>>> pd.__version__
'0.23.4'     # Just updated this per a comment
>>> 
>>> 

The result is a single workbook named 'test.xlsx' with a single tab 'd2'.

How can I prevent the workbook/sheet form being overwritten?

6
  • Possible duplicate of How to write to an existing excel file without overwriting data (using pandas)? Commented Jan 14, 2019 at 17:39
  • 1
    Sorry, a bit too quick to mark as duplicate. This was introduced in pandas version 0.23.1 so you need to upgrade. Commented Jan 14, 2019 at 17:41
  • Thanks, I upgraded per your suggestion. I was using "pip3 install pandas" instead of adding "--upgrade" so I thought I had the latest version. Per your suggestion, I updated but am still getting the same result. Commented Jan 14, 2019 at 17:48
  • Also, the other referenced <stackoverflow.com/questions/20219254/…> is for the function 'df.to_excel()', not 'pd.ExcelWriter'. Commented Jan 14, 2019 at 17:51
  • Are you running this on a REPL? Make sure you restart your shell since the pandas module is loaded in memory with the older version. Commented Jan 14, 2019 at 17:51

4 Answers 4

11

You can use with:

with pd.ExcelWriter('test.xlsx', engine='openpyxl', mode='a') as writer:
    d1.to_excel(writer,sheet_name='d1')
    d2.to_excel(writer,sheet_name='d2')
    writer.save()

writer.close()

update

This should work just note that the a blank file needs to be created before hand. You can just create a blank file using python if you want. I created a simple loop to, in some ways, mimic the essence of what you are trying to accomplish:

import pandas as pd
from openpyxl import load_workbook

d1 = pd.DataFrame({"A":['Bob','Joe', 'Mark'], 
               "B":['5', '10', '20']})
d2 = pd.DataFrame({"A":['Jeffrey','Ann', 'Sue'], 
                "B":['1', '2', '3']})

dfs = [d1,d2]

for i in range(len(dfs)):
    sheet = 'd'+str(i+1)
    data = dfs[i]
    writer = pd.ExcelWriter('atest.xlsx',engine='openpyxl', mode='a')
    writer.book = load_workbook('atest.xlsx') # here is the difference
    data.to_excel(writer,sheet_name=sheet)
    writer.save()
    writer.close()

or here is the modified first example:

d1 = pd.DataFrame({"A":['Bob','Joe', 'Mark'], 
               "B":['5', '10', '20']})
d2 = pd.DataFrame({"A":['Jeffrey','Ann', 'Sue'], 
                "B":['1', '2', '3']})

writer = pd.ExcelWriter('atest.xlsx', engine='openpyxl', mode='w')
d1.to_excel(writer,sheet_name='d1')
writer.save()
writer.close()

writer = pd.ExcelWriter('atest.xlsx', engine='openpyxl', mode='a')
writer.book = load_workbook('atest.xlsx')
d2.to_excel(writer,sheet_name='d2')
writer.save()
writer.close()
Sign up to request clarification or add additional context in comments.

4 Comments

I tried this but the result is comparable to xlsxwriter.readthedocs.io/example_pandas_multiple.html. It's a different engine but the workbook is opened only once then both tabs are written. I need to open/write/close then open/write/close. I tried your suggestion by putting d1.to_excel() and d2.to_excel() in separate with loops but the result was a single XLSX with a single tab. I also tried your suggestion but with setting mode='w'. It worked the same as mode='a'.
@enter_display_name_here why do you need to open/write/close multiple times?
I am building an extremely large dictionary of DataFrames and would like to write out each DataFrame as a tab as they get completed. I would hate to get 100+ DataFrames in and then the program experiences an issue and nothing is written out. Basically, I am scraping 4 websites, compiling different tables of data into a single DF, then doing that process over and over again. I'd like to generate an individual tab per DF as it goes through the program.
Thanks for taking the time to put that together. Your update is actually the workaround to what was done prior to the addition of mode='a'. If you refer to line 1009 (here: github.com/pandas-dev/pandas/pull/21251/files/…), you'll notice that they added book = load_workbook(self.path) to their code. Your code works the same as with mode='w'. If I am unable to get mode='a' to work, then I will use this. I am hoping that their new code will check for an existing tab and then overwrite that.
2

This worked for me, it creates a file if the file does not exists, and append to the end of the file if it already exists

you may need to install openpyxl

import pandas as pd
from openpyxl import load_workbook

def append_xlsx(df, file = 'results.xlsx'):
    ext = '.xlsx'
    if ext not in file:
        file+=ext
    
    if os.path.exists(file):
        mode="a"
        if_sheet_exists="overlay"
        header = False

        wb = load_workbook(file)
        sheet = wb.worksheets[0]
        startrow = sheet.max_row
    else:
        mode='w'
        if_sheet_exists = None
        header = True
        startrow = 0

    with pd.ExcelWriter(
        file,
        mode=mode,
        engine="openpyxl",
        if_sheet_exists=if_sheet_exists,
    ) as writer:

        df.to_excel(
            writer, 
            sheet_name="Sheet1",
            startrow=startrow,
            header=header,
            index=False,
            encoding='utf8'
        )

Comments

1

import pandas as pd

writer = pd.ExcelWriter(wk_path + save_file)
# ....
# build sc_files DataFrame and save. sc_files includes
# a column called OS.

sc_file.to_excel(writer, sheet_name='test')

# build data frame of OS counts out of sc_file
counts_os = sc_file.OS.value_counts() 

# To append to 'test' sheet, use startcol=x1, startrow=y
# To append counts_OS to the end of the current 'test' sheet
y = len(sc_file)
y += 1
counts_os.to_excel(writer, sheet_name='test', 
    startcol=1, startrow=y)

# write counts_os to sheet test2 
counts_os.to_excel(writer, sheet_name='test2')
writer.save()
writer.close()

1 Comment

Just re-read and see that I answered how to append to the same tab but the question is about multiple tabs. I have a program similar to yours that writes multiple tabs successfully. I don't specify engine='openpyxl' and I don't do a writer.save() until the very end of the program - last step. Works though I could see limits if my data was too large.
0

I submitted a post on GitHub and received a response from the contributors (see the highlighted portion below). It turns out that this functionality won't be released until 0.24 so it is not available in 0.23.1. FYI - I downloaded the RC and successfully tried out the mode='a'option. However, there may be a bug with workbooks that do not exist; I receive FileNotFoundError: [Errno 2] No such file or directory: 'test.xlsx'.

"this feature is being released as part of 0.24 which we just issued a release candidate for over the past few days. You can try on the RC or here on master and if neither works open an issue per the contributing guide, but this wouldn't be expected to work on versions older than that"

3 Comments

did you create the blank test.xlsx file first
Yep, I tried that and it worked. I also just double-checked their code. At first, I thought a new workbook would be created if one doesn't exist, but I think that's incorrect. I think the code creates a new workbook if mode= is not 'a', not if a workbook does not exist.
After looking at the code that is correct, a file is not created when mode=='a' they are just using load_workbook, which means the file needs to exist prior to running ExcelWriter with the param mode='a' if mode != a then they are creating a new workbook: self.book = Workbook()

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.