3

I have a web scraper which creates an excel file for this month's scrapes. I want to add today's scrape and every scrape for that month into that file as a new sheet every time it is run. My issue, however, has been that it only overwrites the existing sheet with a new sheet instead of adding it as a separate new sheet. I've tried to do it with xlrd, xlwt, pandas, and openpyxl.

Still brand new to Python so simplicity is appreciated!

Below is just the code dealing with writing the excel file.

# My relevant time variables
ts = time.time()
date_time = datetime.datetime.fromtimestamp(ts).strftime('%y-%m-%d %H_%M_%S')
HourMinuteSecond = datetime.datetime.fromtimestamp(ts).strftime('%H_%M_%S')
month = datetime.datetime.now().strftime('%m-%y')

# Creates a writer for this month and year
writer = pd.ExcelWriter(
    'C:\\Users\\G\\Desktop\\KickstarterLinks(%s).xlsx' % (month), 
    engine='xlsxwriter')

# Creates dataframe from my data, d
df = pd.DataFrame(d)

# Writes to the excel file
df.to_excel(writer, sheet_name='%s' % (HourMinuteSecond))
writer.save()
1

1 Answer 1

4

Update:

This functionality has been added to pandas 0.24.0:

ExcelWriter now accepts mode as a keyword argument, enabling append to existing workbooks when using the openpyxl engine (GH3441)

Previous version:

Pandas has an open feature request for this.

In the mean time, here is a function which adds a pandas.DataFrame to an existing workbook:

Code:

def add_frame_to_workbook(filename, tabname, dataframe, timestamp):
    """
    Save a dataframe to a workbook tab with the filename and tabname
    coded to timestamp

    :param filename: filename to create, can use strptime formatting
    :param tabname: tabname to create, can use strptime formatting
    :param dataframe: dataframe to save to workbook
    :param timestamp: timestamp associated with dataframe
    :return: None
    """
    filename = timestamp.strftime(filename)
    sheet_name = timestamp.strftime(tabname)

    # create a writer for this month and year
    writer = pd.ExcelWriter(filename, engine='openpyxl')
    
    try:
        # try to open an existing workbook
        writer.book = load_workbook(filename)
        
        # copy existing sheets
        writer.sheets = dict(
            (ws.title, ws) for ws in writer.book.worksheets)
    except IOError:
        # file does not exist yet, we will create it
        pass

    # write out the new sheet
    dataframe.to_excel(writer, sheet_name=sheet_name)
    
    # save the workbook
    writer.save()

Test Code:

import datetime as dt
import pandas as pd
from openpyxl import load_workbook

data = [x.strip().split() for x in """
                   Date  Close
    2016-10-18T13:44:59  2128.00
    2016-10-18T13:59:59  2128.75
""".split('\n')[1:-1]]
df = pd.DataFrame(data=data[1:], columns=data[0])

name_template = './sample-%m-%y.xlsx'
tab_template = '%d_%H_%M'
now = dt.datetime.now()
in_an_hour = now + dt.timedelta(hours=1)
add_frame_to_workbook(name_template, tab_template, df, now)
add_frame_to_workbook(name_template, tab_template, df, in_an_hour)

(Source)

Sign up to request clarification or add additional context in comments.

2 Comments

THANK YOU! This looks like a very well written function as well. Did exactly what I needed it to. You're the man Stephen!
@Stephen Rauch - neat function. I am having a similar but different issue. I have a workbook with 9 sheets. I want to specify one sheet in said workbook and have all the data in the sheet replaced with a specific data frame. I want to specify another sheet in the workbook to append the existing data with a data frame. I have had a really hard time figuring this one out. stackoverflow.com/questions/55348234/…

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.