1

Background: Apparently Google doesn't have a straight answer to a very basic question, so here goes...

I have a pandas df with a Open Date column [Dtype = object] which (when previewing df) is formatted yyyy-mm-dd, which is the format I want, great! Not so great however, when I write df to a .csv which then defaults the formatting to m/dd/yyyy.

Issue: I have tried just about everything for the .csv to output yyyy-dd-mm to no avail.

What I've tried:

I have tried specifying a date format when writing the .csv

df.to_csv(filename, date_format="%Y%d%d")

I have tried changing the format of the column in question, prior to writing to a .csv

df['Open Date'] = pd.to_datetime(df['Open Date'])

I have also tried converting the column to a string, to try and force the correct output

df['Open Date'] = df['timestamp'].apply(lambda v: str(v))

Despite these attempts, I still get a m/dd/yyyy output.

Help: where am I embarrasingly going wrong here?

3
  • First, how did you check this? Did you look at the file in a text editor, or (as I suspect) did you just open it in Excel? Remember that, when Excel finds a date, it does its OWN formatting. You have to set the formatting you want Excel to use. Commented Mar 22, 2022 at 5:23
  • Most the time I've been opening the file in Excel and in Notepad++. Per the first attempt, I tried forcing an Excel format through df.to_csv(filename, date_format="%Y-%m-%d") but no luck. Am I approaching this the wrong way? Commented Mar 22, 2022 at 5:27
  • You say "when previewing" - previewing how exactly? If the datatype of the column is 'object', it seems likely it's actually a string, so perhaps your method previewing it parses the string that's in the format you don't like. At any rate, "%Y%d%d" would never produce "%Y-%m-%d" output of course. Commented Mar 22, 2022 at 5:42

1 Answer 1

4

Your question contained various breaking typos which seems to suggest what may be causing the problem in general.

There's a few issues with what you are saying. Consider:

from pandas import DataFrame
from datetime import datetime

# just some example data, including some datetime and string data
data = [
    {'Open date': datetime(2022, 3, 22, 0, 0), 'value': '1'},
    {'Open date': datetime(2022, 3, 22, 0, 1), 'value': '2'},
    {'Open date': datetime(2022, 3, 22, 0, 2), 'value': '3'}
]
df = DataFrame(data)

# note how the 'Open date' columns is actually a `datetime64[ns]`
# the 'value' string however is what you're saying you're getting, `object`
print(df['Open date'].dtype, df['value'].dtype)

# saving with a silly format, to show it works:
df.to_csv('test.csv', date_format='%Y.%m.%d')

The resulting file:

,Open date,value
0,2022.03.22,1
1,2022.03.22,2
2,2022.03.22,3

I picked a silly format because the default format for me is actually %Y-%m-%d .

The most likely issue is that your 'date' column is actually a string column, but the tools you are using to 'preview' your data are interpreting these strings as dates and actually showing them in some other format.

However, with the limited information you provided, it's guesswork. If you provide some example data that demonstrates the issue, it would be easier to say for sure.

Sign up to request clarification or add additional context in comments.

1 Comment

Your answer gets the tick, because it got me thinking - I ran intl.cpl from Windows Search box and realized that firstly my Regional Settings were defaulting all short date values to a format other than what I want. Secondly, I took another look in Notepad++ and noticed the format perfectly align with my df.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.