Python convert a date string to datetime recognised by Excel

Question

I'm using Python and pandas to query a SQL table, store the result in a DataFrame, then write it to an Excel file (.xlsx).

I'm then using a couple of VBA macros to loop through the columns and do some conditional formatting to highlight outliers.

Everything works fine except the date column, where Excel gets stuck and raises an error:

"Method 'Average' of object 'WorksheetFunction' failed"

The date is being stored as a string in the format '20-01-2022', which is presumably causing the error, so I need to convert it to an actual datetime that Excel will recognise when the file is opened.

Example:

import pandas as pd

df = pd.DataFrame([[1, '21-06-2022'], [2, '19-08-2022'], [3, '06-04-2022']], columns=['id', 'date'])

df.to_excel("output.xlsx")

If you then open "output.xlsx" and try to use conditional formatting on the 'date' column, or try =AVERAGE(C2:C4), either nothing happens or you get an error. If you double-click into a cell, Excel suddenly recognises it as a date, but that isn't practical with thousands of cells.

How can I convert dates to a format that excel will recognise immediately upon opening the file?

Please check How to make good reproducible pandas examples, then post a minimal reproducible example. The SQL part is irrelevant as long as you create a sample DataFrame with proper column types. Most likely the date column in your DataFrame is a string, not a datetime object.
– buran, Commented Jan 25, 2023 at 16:58
Also make sure that the problem really is the date in Excel and not something else in your VBA code.
– buran, Commented Jan 25, 2023 at 16:59

ljmc · Accepted Answer · 2023-01-25 20:36:27Z


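Before saving your df to Excel, you need to parse those date strings into real dates. Note that a string like '21-06-2022' is day-month-year, not ISO 8601 (which is year-month-day), so you may need to give the format explicitly.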

There are several ways to do that.

You can use the parse_dates keyword argument of pandas.read_sql to parse specific columns as dates while loading. It also accepts a format per column, so the strings are parsed directly.

import pandas as pd

df = pd.read_sql(
    sql,
    con,
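    parse_dates={
        # Example formats only: %y is a two-digit year, %Y a four-digit one.
        # Strings like '21-06-2022' would need "%d-%m-%Y".
        "<col1>": {"format": "%y-%m-%d"},
        "<col2>": {"format": "%d/%m/%y"},
    },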
)
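
As a self-contained illustration of the parse_dates approach, here is a minimal sketch that uses an in-memory SQLite database in place of the real SQL source. The table name readings and its contents are assumptions made up for the example; only the parse_dates argument matters.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory table standing in for the real SQL source.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE readings (id INTEGER, date TEXT)")
con.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(1, "21-06-2022"), (2, "19-08-2022"), (3, "06-04-2022")],
)
con.commit()

# The inner dict is passed on to pd.to_datetime for that column.
df = pd.read_sql(
    "SELECT id, date FROM readings",
    con,
    parse_dates={"date": {"format": "%d-%m-%Y"}},
)
con.close()

# The date column should now have a datetime dtype rather than object.
print(df.dtypes)
```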

Same as above, but without a format: the columns are parsed as datetimes, and the dates can then be extracted.

import pandas as pd

df = pd.read_sql(sql, con, parse_dates=["<col1>", "<col2>"])
# .dt is a Series accessor, so apply it to each column in turn
df[["<col1>", "<col2>"]] = df[["<col1>", "<col2>"]].apply(lambda s: s.dt.date)

You can load then parse manually with pd.to_datetime, and again extract the dates only.

import pandas as pd

df = pd.read_sql(sql, con)
# pd.to_datetime parses one column at a time here, so apply it per column
df[["<col1>", "<col2>"]] = df[["<col1>", "<col2>"]].apply(lambda s: pd.to_datetime(s).dt.date)

Or, if the strings really are ISO 8601 (YYYY-MM-DD), you could just parse them with datetime.date.fromisoformat.

import pandas as pd
from datetime import date

df = pd.read_sql(sql, con)
# On pandas >= 2.1, DataFrame.map replaces the deprecated applymap
df[["<col1>", "<col2>"]] = df[["<col1>", "<col2>"]].applymap(date.fromisoformat)

NB. The methods are listed in no particular order, but the first seems slightly faster than the others, and it is also the most elegant (in my opinion).
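
For the DataFrame in the question, where there is no SQL step, the same idea might look like the sketch below. It assumes every string is day-month-year; the errors="coerce" choice is an assumption too, and can be dropped if a bad value should raise instead.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, "21-06-2022"], [2, "19-08-2022"], [3, "06-04-2022"]],
    columns=["id", "date"],
)

# Explicit format, so 06-04-2022 is read as 6 April and not 4 June.
# errors="coerce" turns unparseable strings into NaT rather than raising.
df["date"] = pd.to_datetime(df["date"], format="%d-%m-%Y", errors="coerce")

print(df.dtypes)
```

Calling df.to_excel("output.xlsx", index=False) afterwards should write real date cells rather than text, so with the index dropped =AVERAGE(B2:B4) has numbers to work with.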

edited Jan 25, 2023 at 20:36

answered Jan 25, 2023 at 20:16

ljmc


2 Comments

S7ewie Over a year ago

Thank you! Adding parse_dates=["date"] to the pd.read_sql() call worked for me. Do you know if it's possible to format a datetime to display as "day-month-year" without converting it back to a string? I imagine that's something I'll have to do in Excel, since it's Excel that decides how to display it in its own GUI?

ljmc Over a year ago

Exactly, that will be an Excel formatting issue. You can probably do it in Python via openpyxl, but I'm not familiar with a way to do it directly in df.to_excel.
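
One possible way to do this from Python, sketched under the assumption that openpyxl is installed and used as the writer engine: write the DataFrame through pd.ExcelWriter, then set a number format on the date cells. The helper name write_with_date_format and the sheet name are hypothetical. The cells stay real dates; only the display format changes.

```python
import pandas as pd


def write_with_date_format(df, path, date_column, number_format="DD-MM-YYYY"):
    """Write df to an .xlsx file and display one date column as day-month-year.

    Hypothetical helper: assumes the openpyxl engine and that date_column
    already holds datetimes, not strings.
    """
    with pd.ExcelWriter(path, engine="openpyxl") as writer:
        df.to_excel(writer, sheet_name="Sheet1", index=False)
        ws = writer.sheets["Sheet1"]  # the openpyxl worksheet just written
        # Worksheet columns are 1-based; row 1 holds the header.
        col_idx = df.columns.get_loc(date_column) + 1
        for row in ws.iter_rows(min_row=2, min_col=col_idx, max_col=col_idx):
            for cell in row:
                cell.number_format = number_format
```

A call such as write_with_date_format(df, "output.xlsx", "date") would then produce a file whose date column shows as, for example, 21-06-2022 while still being usable in AVERAGE or conditional formatting.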
