18

I want to convert the DatetimeIndex in my DataFrame to a float format that can be analysed in my model. Could someone tell me how to do it? Do I need to use the date2num() function? Many thanks!

5 Answers

18

Convert to a Timedelta and extract the total seconds with dt.total_seconds():

df

        date
0 2013-01-01
1 2013-01-02
2 2013-01-03
3 2013-01-04
4 2013-01-05
5 2013-01-06
6 2013-01-07
7 2013-01-08
8 2013-01-09
9 2013-01-10

pd.to_timedelta(df.date).dt.total_seconds()

0    1.356998e+09
1    1.357085e+09
2    1.357171e+09
3    1.357258e+09
4    1.357344e+09
5    1.357430e+09
6    1.357517e+09
7    1.357603e+09
8    1.357690e+09
9    1.357776e+09
Name: date, dtype: float64

Or perhaps the data would be more useful as an int type:

pd.to_timedelta(df.date).dt.total_seconds().astype(int)

0    1356998400
1    1357084800
2    1357171200
3    1357257600
4    1357344000
5    1357430400
6    1357516800
7    1357603200
8    1357689600
9    1357776000
Name: date, dtype: int64
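
If pd.to_timedelta refuses datetime input in your pandas version (I believe newer releases do), subtracting the Unix epoch should give the same numbers. A minimal sketch, assuming a `date` column like the one above; the names `epoch`, `seconds` and `seconds_int` are only for illustration:

```python
import pandas as pd

# Rebuild a frame like the one shown above: ten consecutive days.
df = pd.DataFrame({"date": pd.date_range("2013-01-01", periods=10, freq="D")})

# Datetime minus datetime gives a Timedelta, which has total_seconds().
epoch = pd.Timestamp("1970-01-01")
seconds = (df["date"] - epoch).dt.total_seconds()  # float64 seconds since the epoch

# Integer variant; int64 is spelled out so the width does not depend on the platform.
seconds_int = seconds.astype("int64")

print(seconds.head(3))
print(seconds_int.head(3))
```

This assumes the dates are timezone-naive; for timezone-aware data the epoch would need a matching timezone.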

5 Comments

Try df.date.values.astype(float).
@Bharathshetty That raises: cannot astype a datetimelike from [datetime64[ns]] to [float64]
I think you have the wrong solution. Try pd.to_datetime(pd.to_timedelta(df.date).dt.total_seconds().values[0]). It gives 1970 ...
@Bharathshetty That's just how the function works: it doesn't understand that the number is an epoch time. The solution isn't wrong. You should understand that the epoch time of 1970 is 0; that's when the Unix OS was developed at Bell Labs, hence the name "Unix timestamp".
I just thought the OP wanted the float representation of the datetime. I don't know what the OP actually wants. Let's see when they come back.
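
On the 1970 result mentioned in the comments: pd.to_datetime treats a bare number as nanoseconds unless told otherwise, so a value in seconds needs unit='s' to round-trip. A small sketch, using the first value from the output above; the variable names are only for illustration:

```python
import pandas as pd

seconds = 1356998400  # first value from the integer output above

# Without a unit the number is read as nanoseconds, i.e. about 1.36 s after the epoch.
as_nanoseconds = pd.to_datetime(seconds)

# With unit="s" the number is read as seconds since the epoch.
as_seconds = pd.to_datetime(seconds, unit="s")

print(as_nanoseconds)
print(as_seconds)
```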
11

Use astype(float). For example, if you have a DataFrame like

df = pd.DataFrame({'date': ['1998-03-01 00:00:01', '2001-04-01 00:00:01',
                            '1998-06-01 00:00:01', '2001-08-01 00:00:01',
                            '2001-05-03 00:00:01', '1994-03-01 00:00:01']})
df['date'] = pd.to_datetime(df['date'])
df['x'] = list('abcdef')
df = df.set_index('date')

Then

df.index.values.astype(float)

array([  8.88710401e+17,   9.86083201e+17,   8.96659201e+17,
     9.96624001e+17,   9.88848001e+17,   7.62480001e+17])

pd.to_datetime(df.index.values.astype(float))

DatetimeIndex(['1998-03-01 00:00:01', '2001-04-01 00:00:01',
           '1998-06-01 00:00:01', '2001-08-01 00:00:01',
           '2001-05-03 00:00:01', '1994-03-01 00:00:01'],
          dtype='datetime64[ns]', freq=None)
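
For what it's worth, the values above appear to be nanoseconds since the epoch rather than seconds, which would explain the 1e17 magnitude. A sketch, assuming you want seconds as floats; the explicit cast to datetime64[ns] and the detour through int64 are my additions so the unit does not depend on the pandas version:

```python
import pandas as pd

idx = pd.DatetimeIndex(["1998-03-01 00:00:01", "2001-04-01 00:00:01"])

# Pin the unit to nanoseconds, then view the timestamps as plain numbers.
nanoseconds = idx.values.astype("datetime64[ns]").astype("int64").astype(float)

# Nanoseconds -> seconds since the epoch.
seconds = nanoseconds / 1e9

# Round trip: tell pandas the numbers are seconds.
restored = pd.to_datetime(seconds, unit="s")

print(seconds)
print(restored)
```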

8 Comments

Note that seconds since the epoch as of 2017 are on the order of 1e9, so 1e17 is incorrect. See stackoverflow.com/a/46502880/4909087 and run stackoverflow.com/questions/4548684/…
But when you convert it back with pd.to_datetime, the original date is returned, isn't it?
Yes, but I presume the OP wants to work with the epoch time. I don't know what astype gives, but it seems like a bug? It's definitely not the epoch time.
I get AttributeError when I use timedelta
Oh, sorry. I started off with a datetime column. Let me modify.
8

I found another solution:

df['date'] = df['date'].astype('datetime64[ns]').astype(int).astype(float)

3 Comments

I've checked it and it works for me. Can you say more about your problem? For me, df['date'] has dtype object because I read it from a CSV. Maybe this is the difference. You can try this: df['date'].astype(int).astype(float)
If you are storing datetime.date objects in your column, converting directly to float will fail. Date objects can be converted to datetime64 to get the resolution required for a numeric representation, but datetime64 values cannot be converted directly to floating point, so the intermediate step of converting to int is necessary.
Using astype(int) raises a warning that suggests using .view(int) instead: flatten_df['first_year_date'].astype('datetime64').view(int).astype(float)
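
A sketch of the same idea with the dtypes spelled out, assuming the column holds date strings read from a CSV as described in the comments. The `date_float` column name is made up for the example, and the result is in nanoseconds since the epoch:

```python
import pandas as pd

# Strings, as they would come out of read_csv without date parsing.
df = pd.DataFrame({"date": ["2019-05-23", "2019-05-24"]})

# Parse, pin the unit to nanoseconds, then go through int64 to float.
parsed = pd.to_datetime(df["date"]).astype("datetime64[ns]")
df["date_float"] = parsed.astype("int64").astype(float)

print(df["date_float"])
```

Spelling out int64 instead of int avoids depending on the platform's default integer width.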
6

I believe this offers another solution, here assuming a dataframe with a DatetimeIndex.

pd.to_numeric(df.index, downcast='float')
# although normally I would prefer an integer, and to coerce errors to NaN
pd.to_numeric(df.index, errors='coerce', downcast='integer')

1

If you only want specific parts of your DatetimeIndex, try this:

ADDITIONAL = 1
# A DatetimeIndex exposes year/month/day directly; the .dt accessor exists only on Series.
ddf_c['ts_part_numeric'] = ((ddf_c.index.year * (10000 * ADDITIONAL)) + (ddf_c.index.month * (100 * ADDITIONAL)) + ((ddf_c.index.day) * ADDITIONAL))

Output is

20190523
20190524

You could adjust it to the time resolution you need.
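
A minimal runnable sketch of this, assuming a plain pandas DataFrame with a DatetimeIndex. The frame, the `x` column and the hour-level `ts_hour_numeric` column are made up for the example:

```python
import pandas as pd

df = pd.DataFrame(
    {"x": [1, 2]},
    index=pd.DatetimeIndex(["2019-05-23 10:00", "2019-05-24 15:00"]),
)
idx = df.index

# yyyymmdd as an integer; int64 leaves headroom for finer resolutions.
ymd = idx.year.astype("int64") * 10000 + idx.month * 100 + idx.day
df["ts_part_numeric"] = ymd.to_numpy()

# Finer resolution: append the hour to get yyyymmddhh.
df["ts_hour_numeric"] = (ymd * 100 + idx.hour).to_numpy()

print(df)
```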
