21

I want to append a DataFrame to an existing Excel file

This code works nearly as desired, but it does not append. When I run it, it writes the DataFrame to the Excel file, but running it again does not add new rows. I have also heard that openpyxl is CPU-intensive, but I have not heard of many workarounds.

import pandas
from openpyxl import load_workbook

book = load_workbook('C:\\OCC.xlsx')
writer = pandas.ExcelWriter('C:\\OCC.xlsx', engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)

df1.to_excel(writer, index = False)

writer.save()

I want the data to be appended each time I run it, but this is not happening.

After each run, the output looks the same as the original data:

A   B   C
H   H   H

After running it a second time, I want:

A   B    C
H   H    H
H   H    H

Apologies if this is obvious. I am new to Python, and the examples I practised with did not work as I wanted.

The question is: how can I append data each time I run the code? I tried switching to xlsxwriter but got AttributeError: 'Workbook' object has no attribute 'add_format'

7 Answers 7

49

First of all, this post is the first piece of the solution; it shows that you should specify startrow=: Append existing excel sheet with new dataframe using python pandas

You might also consider header=False, so it should look like:

df1.to_excel(writer, startrow=2, index=False, header=False)

If you want it to find the end of the sheet automatically and append your df there, use:

startrow = writer.sheets['Sheet1'].max_row
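
To see why max_row can be passed straight to startrow: openpyxl numbers rows from 1, while startrow in to_excel() counts from 0, so the number of the last used row is also the zero-based index of the first free row. Here is a minimal sketch using an in-memory workbook (the sheet name and cell values are made up for illustration):

```python
from openpyxl import Workbook

# Build a small in-memory workbook: one header row plus one data row.
book = Workbook()
sheet = book.active
sheet.title = "Sheet1"
sheet.append(["A", "B", "C"])
sheet.append(["H", "H", "H"])

# max_row is the 1-based number of the last used row ...
last_used_row = sheet.max_row

# ... which is also the 0-based index of the first empty row,
# i.e. the value to pass as startrow to DataFrame.to_excel().
startrow = last_used_row
print(last_used_row, startrow)
```

One caveat, as far as I can tell: openpyxl reports max_row as 1 even for a completely empty sheet, so appending to an empty sheet this way would leave the first row blank.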

And if you want it to go over all of the sheets in the workbook:

for sheetname in writer.sheets:
    df1.to_excel(writer, sheet_name=sheetname, startrow=writer.sheets[sheetname].max_row, index=False, header=False)

By the way, for writer.sheets you could use a dictionary comprehension (I think it's cleaner, but that's up to you; it produces the same output):

writer.sheets = {ws.title: ws for ws in book.worksheets}

So the full code will be:

import pandas
from openpyxl import load_workbook

book = load_workbook('test.xlsx')
writer = pandas.ExcelWriter('test.xlsx', engine='openpyxl')
writer.book = book
writer.sheets = {ws.title: ws for ws in book.worksheets}

for sheetname in writer.sheets:
    df1.to_excel(writer, sheet_name=sheetname, startrow=writer.sheets[sheetname].max_row, index=False, header=False)

writer.save()
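
A note for readers on newer pandas: as far as I know, pandas 2.0 and later no longer allow assigning writer.book, and writer.save() is gone. The same idea can be sketched with the writer's append mode instead. This assumes pandas 1.4 or newer with openpyxl installed; the temporary file and the contents of df1 are made up so the sketch can run on its own.

```python
import os
import tempfile

import pandas as pd

# Hypothetical file and data, created only so the sketch is runnable.
path = os.path.join(tempfile.mkdtemp(), "test.xlsx")
df1 = pd.DataFrame({"A": ["H"], "B": ["H"], "C": ["H"]})

# First run: create the file, including the header row.
df1.to_excel(path, sheet_name="Sheet1", index=False)

# Later runs: open the existing file in append mode and write each
# sheet's new rows below its last used row, without repeating the header.
with pd.ExcelWriter(path, engine="openpyxl", mode="a",
                    if_sheet_exists="overlay") as writer:
    for sheetname, sheet in list(writer.sheets.items()):
        df1.to_excel(writer, sheet_name=sheetname,
                     startrow=sheet.max_row, index=False, header=False)

# Read the sheet back to check that the rows were added, not replaced.
result = pd.read_excel(path, sheet_name="Sheet1")
print(result)
```

The with block saves and closes the file on exit, so no explicit save call is needed.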

2 Comments

This works for me in pandas 0.24.2 for Python 2 (also worked in 0.19.2 when I tried).
sheet name is throwing an error for me in the max_row line
12

All the examples here are quite complicated. The approach in the documentation is much simpler:

import pandas as pd

def append_to_excel(fpath, df, sheet_name):
    # mode="a" opens the existing file; append mode needs the openpyxl engine
    with pd.ExcelWriter(fpath, mode="a", if_sheet_exists="overlay") as f:
        df.to_excel(f, sheet_name=sheet_name)

append_to_excel(<your_excel_path>, <new_df>, <new_sheet_name>)

When using this on Excel files created with LibreOffice/OpenOffice, I get the error:

KeyError: "There is no item named 'xl/drawings/drawing1.xml' in the archive"

which is a bug in openpyxl as mentioned here.
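
To make the behaviour discussed in the comments concrete: when the sheet name passed in does not exist yet, append mode adds the DataFrame as a new sheet and leaves the existing sheets in place. A small sketch, assuming pandas 1.4 or newer with openpyxl installed; the temporary path, sheet names and data are hypothetical:

```python
import os
import tempfile

import pandas as pd
from openpyxl import load_workbook

# Hypothetical workbook with one existing sheet.
path = os.path.join(tempfile.mkdtemp(), "book.xlsx")
pd.DataFrame({"a": [1, 2]}).to_excel(path, sheet_name="Sheet1", index=False)


def append_to_excel(fpath, df, sheet_name):
    # mode="a" keeps the existing sheets; a sheet_name that is not in
    # the file yet is added as a new sheet.
    with pd.ExcelWriter(fpath, engine="openpyxl", mode="a",
                        if_sheet_exists="overlay") as f:
        df.to_excel(f, sheet_name=sheet_name)


append_to_excel(path, pd.DataFrame({"b": [3, 4]}), "Sheet2")

# List the sheets now present in the file.
sheet_names = load_workbook(path).sheetnames
print(sheet_names)
```

If sheet_name names a sheet that already exists, "overlay" writes into that sheet starting at the first row unless a startrow is passed, so existing cells in that range are overwritten rather than extended.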

5 Comments

This appends the new DF to a new sheet.
@JulioS. This appends a new df as a new sheet to the existing excel file.
ValueError: Append mode is not supported with xlsxwriter!
please modify code: with pd.ExcelWriter(fpath, mode="a", if_sheet_exists = 'overlay') as f:
11

You can use the append_df_to_excel() helper function, which is defined in this answer:

Usage examples:

filename = r'C:\OCC.xlsx'

append_df_to_excel(filename, df)

append_df_to_excel(filename, df, header=None, index=False)

append_df_to_excel(filename, df, sheet_name='Sheet2', index=False)

append_df_to_excel(filename, df, sheet_name='Sheet2', index=False, startrow=25)

1 Comment

This was a very helpful function and worked perfectly up until yesterday. Now using this helper function I am getting an error "Value must be a sequence". Any ideas why that might be the case?
7

I read the Excel file into a DataFrame, concatenated it with the new DataFrame, and wrote the result back. It worked for me.

import pandas as pd

def append_df_to_excel(df, excel_path):
    # Read the existing rows (first sheet only), then add the new ones below
    df_excel = pd.read_excel(excel_path)
    result = pd.concat([df_excel, df], ignore_index=True)
    # Rewrites the whole file with a single sheet
    result.to_excel(excel_path, index=False)

df = pd.DataFrame({"a":[11,22,33], "b":[55,66,77]})
append_df_to_excel(df, r"<path_to_dir>\<out_name>.xlsx")
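
Because this approach rewrites the whole file from a single DataFrame, any other sheets in the workbook are dropped. One possible variation is to read every sheet, extend only the target one, and write them all back. The helper name append_df_keep_sheets and its sheet_name parameter are my own, not part of pandas, and the sketch assumes openpyxl is installed:

```python
import os
import tempfile

import pandas as pd


def append_df_keep_sheets(df, excel_path, sheet_name="Sheet1"):
    # Hypothetical helper: read all sheets, extend one, write all back.
    sheets = pd.read_excel(excel_path, sheet_name=None)  # dict: name -> DataFrame
    if sheet_name in sheets:
        sheets[sheet_name] = pd.concat([sheets[sheet_name], df], ignore_index=True)
    else:
        sheets[sheet_name] = df
    with pd.ExcelWriter(excel_path, engine="openpyxl") as writer:
        for name, frame in sheets.items():
            frame.to_excel(writer, sheet_name=name, index=False)


# Made-up workbook with two sheets, used only to exercise the helper.
path = os.path.join(tempfile.mkdtemp(), "out.xlsx")
with pd.ExcelWriter(path, engine="openpyxl") as writer:
    pd.DataFrame({"a": [1], "b": [2]}).to_excel(writer, sheet_name="Sheet1", index=False)
    pd.DataFrame({"x": [9]}).to_excel(writer, sheet_name="Other", index=False)

append_df_keep_sheets(pd.DataFrame({"a": [11, 22], "b": [55, 66]}), path)
after = pd.read_excel(path, sheet_name=None)
print(list(after))
```

This still recreates the file from DataFrames, so cell formatting and formulas in the original workbook should not be expected to survive.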

3 Comments

The pd.concat was just the thing I was looking for. Simple and effective. Well done Victor.
Perfect answer: so simple! Thank you!
this deleted my existing sheet1
1

If someone needs it, I found an easier way:

Convert the DataFrame to a list of rows:
rows = your_df.values.tolist()
Load your workbook:
workbook = load_workbook(filename=your_excel)
Pick your sheet:
sheet = workbook[your_sheet]
Iterate over the rows and append each one:
for row in rows:
    sheet.append(row)
Save the workbook when done:
workbook.save(filename=your_excel)
Putting it all together:
from openpyxl import load_workbook

rows = your_df.values.tolist()
workbook = load_workbook(filename=your_excel)
sheet = workbook[your_sheet]
for row in rows:
    sheet.append(row)
workbook.save(filename=your_excel)
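
A variation on the same idea: openpyxl ships a dataframe_to_rows helper in openpyxl.utils.dataframe that can stand in for values.tolist(). The sketch below first creates a small workbook in a temporary location so it can run on its own; the path, sheet name and data are made up, and I have not checked the helper against every pandas/openpyxl combination.

```python
import os
import tempfile

import pandas as pd
from openpyxl import Workbook, load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows

# Hypothetical existing workbook that contains only a header row.
your_excel = os.path.join(tempfile.mkdtemp(), "book.xlsx")
book = Workbook()
book.active.title = "Sheet1"
book.active.append(["A", "B", "C"])
book.save(your_excel)

your_df = pd.DataFrame({"A": ["H"], "B": ["H"], "C": ["H"]})

# Append each DataFrame row below the existing data, skipping the
# index and the header so only the values are written.
workbook = load_workbook(filename=your_excel)
sheet = workbook["Sheet1"]
for row in dataframe_to_rows(your_df, index=False, header=False):
    sheet.append(row)
workbook.save(filename=your_excel)

# Read the sheet back as plain lists to inspect the result.
values = [list(r) for r in load_workbook(your_excel)["Sheet1"].iter_rows(values_only=True)]
print(values)
```

sheet.append() always writes below the last used row, so no startrow bookkeeping is needed with this route.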

Comments

0
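import os
import pandas as pd

def append_to_excel(fpath, df):
    # Start from the existing rows if the file exists, else from an empty frame
    if os.path.exists(fpath):
        x = pd.read_excel(fpath)
    else:
        x = pd.DataFrame()

    # Existing rows first, so the new data lands at the end
    df_new = pd.concat([x, df], ignore_index=True)
    df_new.to_excel(fpath, index=False)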

2 Comments

This answer was reviewed in the Low Quality Queue. Here are some guidelines for How do I write a good answer?. Code only answers are not considered good answers, and are likely to be downvoted and/or deleted because they are less useful to a community of learners. It's only obvious to you. Explain what it does, and how it's different / better than existing answers. From Review
Please don't post code-only answers. The main audience, future readers, will be grateful to see explained why this answers the question instead of having to infer it from the code. Also, since this is an old question, please explain how it complements all other answers.
0

Why complicate things? Simply get the number of rows in the Excel file and pass it as the startrow parameter to determine where to append:

import pandas as pd
import openpyxl as xl

# Get the number of rows in the Excel file (to determine where to append)
source_file = xl.load_workbook("file.xlsx")
sheet = source_file["sheetname"]
row_count = sheet.max_row
source_file.close()

# data is the DataFrame to append; header=False avoids repeating the column names
with pd.ExcelWriter("file.xlsx", mode="a", if_sheet_exists="overlay") as writer:
    data.to_excel(writer, sheet_name="sheetname", index=False, header=False, startrow=row_count)

Comments