How to convert Pandas Series of dates string into date objects?

Question

I have a Pandas data frame, and one of its columns holds dates as strings in YYYY-MM-DD format.

For example: '2013-10-28'

At the moment the dtype of the column is object.

How do I convert the column values to Pandas date format?

Pandas has a native DATETIME type (datetime64); it doesn't have a native DATE dtype (any column containing DATE objects will be object dtype). It's much faster to work with datetime64 than with a dtype=object column of date objects. You can see here that something as simple as adding a timedelta to timestamps is 100x faster on datetime64. If you really need DATE objects (e.g. to dump the data into a database), then it's probably better to work on datetime64 while in pandas and convert to dates just before storing the data elsewhere.
– cottontail, Commented Apr 15, 2024 at 20:32
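
A minimal sketch of the workflow suggested in that comment, assuming a hypothetical column named day (not from the question): do the arithmetic on datetime64 and convert to Python date objects only at the very end.

```python
import datetime
import pandas as pd

# Hypothetical data; the column name "day" is an assumption for illustration.
df = pd.DataFrame({"day": ["2013-10-28", "2013-10-29"]})

# Work in datetime64 while inside pandas (vectorized arithmetic).
df["day"] = pd.to_datetime(df["day"])
df["next_week"] = df["day"] + pd.Timedelta(days=7)

# Convert to Python date objects only when handing the data off.
# The result is an object-dtype column of datetime.date values.
as_dates = df["next_week"].dt.date

print(df.dtypes)
print(as_dates.dtype)
```

The two printed dtypes show the difference the comment describes: the working columns stay datetime64, while the converted column is object.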

Trenton McKinney · Accepted Answer · 2021-05-18 18:44:45Z

160

Essentially equivalent to @waitingkuo, but I would use pd.to_datetime here (it seems a little cleaner, and offers some additional functionality e.g. dayfirst):

In [11]: df
Out[11]:
   a        time
0  1  2013-01-01
1  2  2013-01-02
2  3  2013-01-03

In [12]: pd.to_datetime(df['time'])
Out[12]:
0   2013-01-01 00:00:00
1   2013-01-02 00:00:00
2   2013-01-03 00:00:00
Name: time, dtype: datetime64[ns]

In [13]: df['time'] = pd.to_datetime(df['time'])

In [14]: df
Out[14]:
   a                time
0  1 2013-01-01 00:00:00
1  2 2013-01-02 00:00:00
2  3 2013-01-03 00:00:00
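
As a rough illustration of the extra functionality mentioned above, here is a sketch that passes an explicit format for the YYYY-MM-DD strings from the question and uses dayfirst for a second, hypothetical series of DD/MM/YYYY strings (that second series is an assumption, not part of the question).

```python
import pandas as pd

# ISO-style strings, as in the question; format= makes the expected layout explicit.
iso = pd.Series(["2013-01-01", "2013-01-02", "2013-01-03"])
parsed_iso = pd.to_datetime(iso, format="%Y-%m-%d")

# Hypothetical day-first strings (DD/MM/YYYY); dayfirst=True tells the parser
# to read the first number as the day rather than the month.
day_first = pd.Series(["28/10/2013", "01/02/2013"])
parsed_day_first = pd.to_datetime(day_first, dayfirst=True)

print(parsed_iso)
print(parsed_day_first)
```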

Handling ValueErrors
If running

df['time'] = pd.to_datetime(df['time'])

raises

ValueError: Unknown string format

then the column contains invalid (non-coercible) values. If you are okay with having them converted to pd.NaT, you can add an errors='coerce' argument to to_datetime:

df['time'] = pd.to_datetime(df['time'], errors='coerce')
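
A small self-contained sketch of the coercion behaviour, using a made-up column with one unparseable entry (the data is hypothetical):

```python
import pandas as pd

# Hypothetical column with one unparseable entry.
df = pd.DataFrame({"time": ["2013-01-01", "not a date", "2013-01-03"]})

# errors='coerce' turns unparseable entries into NaT instead of raising.
df["time"] = pd.to_datetime(df["time"], errors="coerce")

# Count the rows that failed to parse.
n_bad = df["time"].isna().sum()
print(df)
print(n_bad)
```

Checking isna() afterwards is a cheap way to see how many values were silently replaced before relying on the column.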

edited May 18, 2021 at 18:44

Trenton McKinney


answered May 31, 2013 at 9:46

Andy Hayden


10 Comments

yoshiserry Over a year ago

Hi Guys, @AndyHayden can you remove the time part from the date? I don't need that part?

Andy Hayden Over a year ago

In pandas' 0.13.1 the trailing 00:00:00s aren't displayed.

yoshiserry Over a year ago

and what about in other versions, how do we remove / and or not display them?

Andy Hayden Over a year ago

I don't think this can be done in a nice way, there is discussion to add date_format like float_format (which you've seen). I recommend upgrading anyway.

yoshiserry Over a year ago

My problem is that my date is in this format: 41516.43, and I get this error. I would expect it to return something like 2014-02-03 in the new column?! The error:

#convert date values in the "load_date" column to dates
budget_dataset['date_last_load'] = pd.to_datetime(budget_dataset['load_date'])
budget_dataset

-c:2: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_index,col_indexer] = value instead
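
The comment does not say where a number like 41516.43 comes from. If it happens to be an Excel-style serial day number (an assumption, not something stated in the thread), one possible conversion is to treat it as days counted from 1899-12-30. The frame below is made up for illustration and only reuses the column names from the comment.

```python
import pandas as pd

# Assumption: the numbers are Excel serial days (days since 1899-12-30).
budget_dataset = pd.DataFrame({"load_date": [41516.43, 41275.0]})

# unit="D" reads the numbers as days; origin sets the day-zero reference.
budget_dataset["date_last_load"] = pd.to_datetime(
    budget_dataset["load_date"], unit="D", origin="1899-12-30"
)

# Drop the time-of-day part while keeping the datetime64 dtype.
budget_dataset["date_last_load"] = budget_dataset["date_last_load"].dt.normalize()
print(budget_dataset)
```

If the numbers use a different epoch or unit, origin and unit would need to change accordingly. The SettingWithCopyWarning in the comment is a separate matter: it is a warning about assigning into a slice of another DataFrame, not a parsing error.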


waitingkuo · Accepted Answer · 2013-05-31 08:36:33Z

138

Use astype

In [31]: df
Out[31]: 
   a        time
0  1  2013-01-01
1  2  2013-01-02
2  3  2013-01-03

In [32]: df['time'] = df['time'].astype('datetime64[ns]')

In [33]: df
Out[33]: 
   a                time
0  1 2013-01-01 00:00:00
1  2 2013-01-02 00:00:00
2  3 2013-01-03 00:00:00

answered May 31, 2013 at 8:36

waitingkuo


7 Comments

user7289 Over a year ago

Nice - thank you - how do I get rid of the 00:00:00 at the end of each date?

waitingkuo Over a year ago

A pandas Timestamp has both a date and a time. Do you mean converting it into a Python date object?

waitingkuo Over a year ago

You can convert it by df['time'] = [time.date() for time in df['time']]

yoshiserry Over a year ago

what does the [ns] mean, can you make the text string a date and remove the time part of that date?

Andy Hayden Over a year ago

@yoshiserry it's nanoseconds, and is the way the dates are stored under the hood once converted properly (epoch-time in nanoseconds).
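
To make the "epoch time in nanoseconds" point concrete, here is a short sketch that views the integers behind a datetime64[ns] column. The explicit cast to datetime64[ns] is there only so the unit is unambiguous regardless of how the strings were parsed.

```python
import pandas as pd

s = pd.to_datetime(pd.Series(["1970-01-01", "1970-01-02"]))

# Make the nanosecond unit explicit, then view the underlying integers:
# nanoseconds elapsed since 1970-01-01 00:00:00.
as_ns = s.astype("datetime64[ns]").astype("int64")
print(as_ns.tolist())
```

One day after the epoch comes out as 86,400 seconds times 10^9 nanoseconds.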


fantabolous · Accepted Answer · 2023-05-05 05:12:51Z

47

I imagine a lot of data comes into Pandas from CSV files, in which case you can simply convert the date during the initial CSV read:

dfcsv = pd.read_csv('xyz.csv', parse_dates=[0])

where the 0 refers to the column the date is in.
You could also add index_col=0 to the call if you want the date to be your index.

See https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html

You can also select column(s) to parse by name rather than position, e.g. parse_dates=['thedate']
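
A runnable sketch of parsing by column name, using an in-memory string in place of a file on disk. The CSV content and the value column are made up; thedate is the column name used above.

```python
import io
import pandas as pd

# Stand-in for a CSV file on disk; the contents are hypothetical.
csv_text = "thedate,value\n2013-01-01,1\n2013-01-02,2\n"

# Parse the named column as dates and use it as the index in one step.
dfcsv = pd.read_csv(
    io.StringIO(csv_text), parse_dates=["thedate"], index_col="thedate"
)

print(dfcsv.index)
```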

edited May 5, 2023 at 5:12

answered Mar 19, 2014 at 4:15

fantabolous


1 Comment

Sastibe Over a year ago

Thanks, that was exactly what I needed. The documentation has moved, though, you can find it here: pandas.pydata.org/pandas-docs/stable/reference/api/…

szeitlin · Accepted Answer · 2015-11-07 00:22:59Z

30

Now you can do df['column'].dt.date

Note that for datetime objects, if you don't see the hour when they're all 00:00:00, that's not pandas. That's iPython notebook trying to make things look pretty.
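
A brief sketch of how the .dt accessor behaves, assuming a column literally named column that still holds strings: the accessor is only available once the column has a datetime dtype, and .dt.date then yields Python date objects in an object-dtype column.

```python
import pandas as pd

df = pd.DataFrame({"column": ["2013-01-01", "2013-01-02"]})

# On a column of strings, .dt is not available and raises AttributeError.
try:
    df["column"].dt.date
    dt_worked = True
except AttributeError as exc:
    dt_worked = False
    print(exc)

# After converting, .dt.date returns Python date objects (object dtype).
df["column"] = pd.to_datetime(df["column"])
dates = df["column"].dt.date
print(dates)
```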

answered Nov 7, 2015 at 0:22

szeitlin


4 Comments

smishra Over a year ago

This one does not work for me, it complains: Can only use .dt accessor with datetimelike values

szeitlin Over a year ago

you may have to do df[col] = pd.to_datetime(df[col]) first to convert your column to date time objects.

elPastor Over a year ago

The issue with this answer is that it converts the column to dtype = object which takes up considerably more memory than a true datetime dtype in pandas.

szeitlin Over a year ago

This person didn't say anything about the size of the dataset, or about parquet.
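
For anyone who does care about size, the memory point raised in the comments can be checked roughly like this. The series of 10,000 daily timestamps is invented for the comparison, and the exact byte counts will vary by platform and pandas version.

```python
import pandas as pd

# Hypothetical column of 10,000 daily timestamps.
as_datetime = pd.Series(pd.date_range("2013-01-01", periods=10_000, freq="D"))
as_object = as_datetime.dt.date

# deep=True also counts the Python objects referenced by an object-dtype column.
datetime_bytes = as_datetime.memory_usage(index=False, deep=True)
object_bytes = as_object.memory_usage(index=False, deep=True)
print(datetime_bytes, object_bytes)
```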

David Valenzuela Urrutia · Accepted Answer · 2019-12-06 19:50:15Z

13

If you want to get the DATE and not DATETIME format:

df["id_date"] = pd.to_datetime(df["id_date"]).dt.date

answered Dec 6, 2019 at 19:50

David Valenzuela Urrutia


2 Comments

sq89 Over a year ago

I agree with @Asclepius. I've had a lot more issues with using .dt.date than just converting to datetime first.

Dinesh vishe Over a year ago

we are getting following error: could not convert string to float: '12/12/2010'. How to solve it ?

SSS · Accepted Answer · 2019-04-29 14:18:46Z

7

Another way to do this, which works well if you have multiple columns to convert to datetime:

cols = ['date1','date2']
df[cols] = df[cols].apply(pd.to_datetime)
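
A self-contained variant of the same idea, with made-up data and an extra errors='coerce' keyword, to show that keyword arguments given to apply are forwarded to pd.to_datetime for each column:

```python
import pandas as pd

# Hypothetical frame: two date-like string columns and one unrelated column.
df = pd.DataFrame({
    "date1": ["2013-01-01", "2013-01-02"],
    "date2": ["2013-02-01", "bad value"],
    "other": [1, 2],
})

cols = ["date1", "date2"]
# Keyword arguments after the function are forwarded to pd.to_datetime per column.
df[cols] = df[cols].apply(pd.to_datetime, errors="coerce")
print(df.dtypes)
```

Applying per column matters here because pd.to_datetime called on a whole DataFrame tries to assemble dates from columns named year, month and day, rather than converting each column separately.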

answered Apr 29, 2019 at 14:18

SSS


3 Comments

Mark Andersen Over a year ago

The question asks for date, not datetime.

Sumax Over a year ago

@MarkAndersen as long as you have date-only values in your columns, conversion to datetime will retain only the pertinent information. If you explicitly convert using df['datetime_col'].dt.date, that will result in an object dtype, which is a loss in memory efficiency.

Asclepius Over a year ago

There is no reason to use .apply here, considering a direct use of pd.to_datetime works.

Ted M. · Accepted Answer · 2017-10-17 21:30:24Z

It may be the case that dates need to be converted to a different frequency. In this case, I would suggest setting an index by dates.

#set an index by dates
df.set_index(['time'], drop=True, inplace=True)

After this, you can more easily convert to the type of date format you will need most. Below, I sequentially convert to a number of date formats, ultimately ending up with a set of daily dates at the beginning of the month.

#Convert to daily dates
df.index = pd.DatetimeIndex(data=df.index)

#Convert to monthly dates
df.index = df.index.to_period(freq='M')

#Convert to strings
df.index = df.index.strftime('%Y-%m')

#Convert to daily dates
df.index = pd.DatetimeIndex(data=df.index)

For brevity, I don't show that I run the following code after each line above:

print(df.index)
print(df.index.dtype)
print(type(df.index))

This gives me the following output:

Index(['2013-01-01', '2013-01-02', '2013-01-03'], dtype='object', name='time')
object
<class 'pandas.core.indexes.base.Index'>

DatetimeIndex(['2013-01-01', '2013-01-02', '2013-01-03'], dtype='datetime64[ns]', name='time', freq=None)
datetime64[ns]
<class 'pandas.core.indexes.datetimes.DatetimeIndex'>

PeriodIndex(['2013-01', '2013-01', '2013-01'], dtype='period[M]', name='time', freq='M')
period[M]
<class 'pandas.core.indexes.period.PeriodIndex'>

Index(['2013-01', '2013-01', '2013-01'], dtype='object')
object
<class 'pandas.core.indexes.base.Index'>

DatetimeIndex(['2013-01-01', '2013-01-01', '2013-01-01'], dtype='datetime64[ns]', freq=None)
datetime64[ns]
<class 'pandas.core.indexes.datetimes.DatetimeIndex'>
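
If the goal is only the last result, daily timestamps at the beginning of each month, a shorter route that skips the string round trip might look like the sketch below. The index values are made up for illustration, and this is an alternative to the steps above rather than the method used in this answer.

```python
import pandas as pd

idx = pd.DatetimeIndex(["2013-01-15", "2013-02-03"], name="time")

# Convert to monthly periods, then back to a timestamp at the start of each period.
month_start = idx.to_period("M").to_timestamp()
print(month_start)
```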

rubebop · Accepted Answer · 2020-08-16 02:13:27Z

1

For completeness, here is another option. It may not be the most straightforward one, and it is somewhat similar to the one proposed by @SSS, but it uses the datetime library instead (the format string must match your data; '%Y-%m-%d' matches the YYYY-MM-DD strings in the question):

import datetime
df["Date"] = df["Date"].apply(lambda x: datetime.datetime.strptime(x, '%Y-%m-%d').date())

answered Aug 16, 2020 at 2:13

rubebop



donDrey · Accepted Answer · 2020-06-24 03:35:29Z

0

 #   Column          Non-Null Count   Dtype         
---  ------          --------------   -----         
 0   startDay        110526 non-null  object
 1   endDay          110526 non-null  object

import pandas as pd

df['startDay'] = pd.to_datetime(df.startDay)

df['endDay'] = pd.to_datetime(df.endDay)

 #   Column          Non-Null Count   Dtype         
---  ------          --------------   -----         
 0   startDay        110526 non-null  datetime64[ns]
 1   endDay          110526 non-null  datetime64[ns]

edited Jun 24, 2020 at 3:35

answered Jun 24, 2020 at 3:28

donDrey


1 Comment

smci Over a year ago

No, this converts it to a 'datetime64[ns]' type not a 'date' type. Those are different things.

Mwanaidi Nicole · Accepted Answer · 2020-02-13 16:59:47Z

-1

Try converting one of the rows into a timestamp using the pd.to_datetime function, and then use .map to apply that function to the entire column.

answered Feb 13, 2020 at 16:59

Mwanaidi Nicole



shubham maini · Accepted Answer · 2023-04-11 04:36:37Z

-1

You can use the pandas.to_datetime()

answered Apr 11, 2023 at 4:36

shubham maini

