I want to convert the DatetimeIndex in my DataFrame to float format, which can be analysed in my model. Could someone tell me how to do it? Do I need to use the date2num() function? Many thanks!
5 Answers
Convert to Timedelta and extract the total seconds from dt.total_seconds:
df
date
0 2013-01-01
1 2013-01-02
2 2013-01-03
3 2013-01-04
4 2013-01-05
5 2013-01-06
6 2013-01-07
7 2013-01-08
8 2013-01-09
9 2013-01-10
pd.to_timedelta(df.date).dt.total_seconds()
0 1.356998e+09
1 1.357085e+09
2 1.357171e+09
3 1.357258e+09
4 1.357344e+09
5 1.357430e+09
6 1.357517e+09
7 1.357603e+09
8 1.357690e+09
9 1.357776e+09
Name: date, dtype: float64
Or, maybe, the data would be more useful presented as an int type:
pd.to_timedelta(df.date).dt.total_seconds().astype(int)
0 1356998400
1 1357084800
2 1357171200
3 1357257600
4 1357344000
5 1357430400
6 1357516800
7 1357603200
8 1357689600
9 1357776000
Name: date, dtype: int64
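Depending on the pandas version, pd.to_timedelta may refuse a datetime64 column. A minimal sketch of the same idea that does not rely on it, assuming a datetime64 column named date as above: subtract the Unix epoch to get timedeltas, then divide by one second. The sample frame and the variable names here are made up for illustration.

```python
import pandas as pd

# Illustrative frame with a datetime64 column, mirroring the first rows above
df = pd.DataFrame({"date": pd.date_range("2013-01-01", periods=3, freq="D")})

# Subtracting the Unix epoch yields a timedelta column
epoch = pd.Timestamp("1970-01-01")
delta = df["date"] - epoch

# Dividing by a one-second Timedelta gives float seconds since the epoch
seconds = delta / pd.Timedelta(seconds=1)

# The .dt accessor route gives the same numbers
seconds_alt = delta.dt.total_seconds()

print(seconds.tolist())
```

Both routes return float64, so .astype("int64") can follow if whole seconds are preferred.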
5 Comments
Bharath M Shetty
Try df.date.values.astype(float) once
cs95
@Bharathshetty cannot astype a datetimelike from [datetime64[ns]] to [float64]
Bharath M Shetty
I think you got a wrong solution. Try pd.to_datetime(pd.to_timedelta(df.date).dt.total_seconds().values[0]). It's giving 1970 ...
cs95
@Bharathshetty that's just how the function works. It doesn't understand that the number is in epoch seconds. The solution isn't wrong. You should understand that the epoch time of 1970 is 0; that's when the Unix OS was developed at Bell Labs, hence the name "Unix timestamp".
Bharath M Shetty
I just thought the OP wanted the float representation of the datetime. I don't know what the OP wants in reality. Let's see when he comes back.
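The 1970 result in the exchange above is probably explained by units: with no unit argument, pd.to_datetime reads a bare number as nanoseconds since the epoch. A small sketch of the round trip, using the first epoch value from the answer (the variable names are illustrative):

```python
import pandas as pd

seconds = 1356998400  # epoch seconds for 2013-01-01 00:00:00

# Default: the number is treated as nanoseconds, which lands in 1970
as_ns = pd.to_datetime(seconds)

# With unit="s" the same number is treated as seconds
as_s = pd.to_datetime(seconds, unit="s")

print(as_ns)
print(as_s)
```

Passing unit="s" recovers the original date, so the seconds-based representation is reversible.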
Use astype(float), i.e., if you have a DataFrame like
df = pd.DataFrame({'date': ['1998-03-01 00:00:01', '2001-04-01 00:00:01','1998-06-01 00:00:01','2001-08-01 00:00:01','2001-05-03 00:00:01','1994-03-01 00:00:01'] })
df['date'] = pd.to_datetime(df['date'])
df['x'] = list('abcdef')
df = df.set_index('date')
Then
df.index.values.astype(float)
array([ 8.88710401e+17, 9.86083201e+17, 8.96659201e+17,
9.96624001e+17, 9.88848001e+17, 7.62480001e+17])
pd.to_datetime(df.index.values.astype(float))
DatetimeIndex(['1998-03-01 00:00:01', '2001-04-01 00:00:01',
'1998-06-01 00:00:01', '2001-08-01 00:00:01',
'2001-05-03 00:00:01', '1994-03-01 00:00:01'],
dtype='datetime64[ns]', freq=None)
8 Comments
cs95
Note that seconds since the epoch as of 2017 are of the order of 1e9, so values of the order of 1e17 are incorrect. See stackoverflow.com/a/46502880/4909087 and run stackoverflow.com/questions/4548684/…
Bharath M Shetty
But when you convert it back with pd.to_datetime, the original date is returned, right?
cs95
Yes, but I presume OP wants to work with the epoch time. I don't know what astype gives, but it seems like a bug? It's definitely not the epoch time.
Bharath M Shetty
I get AttributeError when I use timedelta
cs95
Oh, sorry. I started off with a datetime column. Let me modify.
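The two answers appear to differ only in units: the values returned by astype(float) match nanoseconds since the epoch, while total_seconds() gives seconds. A sketch under that assumption, reusing two dates from the answer; the explicit cast to datetime64[ns] is added here so the unit does not depend on the pandas version.

```python
import pandas as pd

idx = pd.DatetimeIndex(["1998-03-01 00:00:01", "2001-04-01 00:00:01"])

# Force nanosecond resolution, then cast to float: nanoseconds since the epoch
nanos = idx.values.astype("datetime64[ns]").astype(float)

# Dividing by 1e9 converts nanoseconds to seconds since the epoch
seconds = nanos / 1e9

print(nanos)
print(seconds)
```

Either scale works as a model feature; the seconds scale is the one usually called a Unix timestamp.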
I found another solution:
df['date'] = df['date'].astype('datetime64').astype(int).astype(float)
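A hedged variant of the same chain for an object column holding datetime.date values. It uses pd.to_datetime for parsing and spells out datetime64[ns] and int64, on the assumption that a unitless 'datetime64' and a platform-dependent int may not behave the same across pandas and NumPy versions. The sample data is invented for illustration.

```python
import datetime as dt
import pandas as pd

# Object-dtype column of datetime.date values (illustrative data)
df = pd.DataFrame({"date": [dt.date(2013, 1, 1), dt.date(2013, 1, 2)]})

# Parse, pin the resolution to nanoseconds, go through int64, then float
as_float = (
    pd.to_datetime(df["date"])
    .astype("datetime64[ns]")
    .astype("int64")   # nanoseconds since the epoch
    .astype(float)
)

print(as_float.tolist())
```

The result is in nanoseconds; divide by 1e9 if seconds are needed.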
3 Comments
Tomek Tajne
I've checked it and it works for me. Can you say more about your problem? For me, df['date'] has dtype object because I read it from a CSV. Maybe this is the difference. You can try this: df['date'].astype(int).astype(float)
Rob Hall
If you are storing datetime.date objects in your column, conversion directly to float will fail. Date objects may be converted to datetime64 in order to get the resolution required for a numeric representation, but these may not be converted to floating-point values, so the intermediate step of converting to int is necessary.
Dr Fabio Gori
Using astype(int) raises a warning; it suggests using .view(int): flatten_df['first_year_date'].astype('datetime64').view(int).astype(float)
If you only want specific parts of your DatetimeIndex, try this:
ADDITIONAL = 1
ddf_c['ts_part_numeric'] = ((ddf_c.index.dt.year * (10000 * ADDITIONAL)) + (ddf_c.index.dt.month * (100 * ADDITIONAL)) + ((ddf_c.index.dt.day) * ADDITIONAL))
Output is
20190523
20190524
You could adjust it to the time resolution you need.
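The snippet above reaches the date parts through index.dt, which may depend on the library behind ddf_c. On a plain pandas DatetimeIndex the parts are available directly as index.year, index.month and index.day, so a sketch of the same encoding, with an invented two-row frame, could look like this:

```python
import pandas as pd

# Illustrative frame indexed by the two dates shown in the output above
df = pd.DataFrame(
    {"x": [1, 2]},
    index=pd.DatetimeIndex(["2019-05-23", "2019-05-24"]),
)

idx = df.index

# Pack year, month and day into one integer of the form YYYYMMDD
packed = idx.year * 10000 + idx.month * 100 + idx.day
df["ts_part_numeric"] = packed.to_numpy()

print(df["ts_part_numeric"].tolist())
```

Note that this encoding is not evenly spaced (the step from 20190531 to 20190601 is 70), which may matter for models that treat the feature as continuous.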