
I am trying to add rows to my pandas dataframe as such:

import pandas as pd
import datetime as dt

d={'datetime':[dt.datetime(2018,3,1,0,0),dt.datetime(2018,3,1,0,10),dt.datetime(2018,3,1,0,40)],
  'value':[4.,5.,1.]}

df=pd.DataFrame(d)

Which outputs:

             datetime  value
0 2018-03-01 00:00:00    4.0
1 2018-03-01 00:10:00    5.0
2 2018-03-01 00:40:00    1.0

What I want to do is add rows from 00:00:00 to 00:40:00, to show every 5 minutes. My desired output looks like this:

             datetime  value
0 2018-03-01 00:00:00    4.0
1 2018-03-01 00:05:00    NaN
2 2018-03-01 00:10:00    5.0
3 2018-03-01 00:15:00    NaN
4 2018-03-01 00:20:00    NaN
5 2018-03-01 00:25:00    NaN
6 2018-03-01 00:30:00    NaN
7 2018-03-01 00:35:00    NaN
8 2018-03-01 00:40:00    1.0

How do I get there?

2 Answers

You can use pd.DataFrame.resample:

# Bin the rows into 5-minute intervals; bins with no rows yield NaN
df = df.resample('5min', on='datetime')['value'].first().reset_index()

print(df)

             datetime  value
0 2018-03-01 00:00:00    4.0
1 2018-03-01 00:05:00    NaN
2 2018-03-01 00:10:00    5.0
3 2018-03-01 00:15:00    NaN
4 2018-03-01 00:20:00    NaN
5 2018-03-01 00:25:00    NaN
6 2018-03-01 00:30:00    NaN
7 2018-03-01 00:35:00    NaN
8 2018-03-01 00:40:00    1.0
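
As a possible alternative to resample, here is a minimal sketch using asfreq. It assumes every existing timestamp already falls on the 5-minute grid, as in the question's data; rows that are off the grid would be dropped rather than binned. The name filled is only for illustration.

```python
import datetime as dt
import pandas as pd

df = pd.DataFrame({
    'datetime': [dt.datetime(2018, 3, 1, 0, 0),
                 dt.datetime(2018, 3, 1, 0, 10),
                 dt.datetime(2018, 3, 1, 0, 40)],
    'value': [4., 5., 1.],
})

# asfreq needs a DatetimeIndex; it reindexes onto a regular 5-minute
# grid running from the first to the last timestamp, inserting NaN
# for timestamps that were not present.
filled = df.set_index('datetime').asfreq('5min').reset_index()
print(filled)
```

Unlike resample(...).first(), asfreq does no aggregation, so it only fits when there is at most one row per grid point.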

1 Comment

Thank. You. So. Much. This probably saved me a couple of hours.

First, create a dataframe indexed by the full target datetime range, then assign the values from your original dataframe to it:

import numpy as np

# Empty frame on the target grid: 9 timestamps, 5 minutes apart
df1 = pd.DataFrame({'value': np.nan},
                   index=pd.date_range('2018-03-01 00:00:00',
                                       periods=9, freq='5min'))

print(df1)
#Output :
                   value
2018-03-01 00:00:00 NaN
2018-03-01 00:05:00 NaN
2018-03-01 00:10:00 NaN
2018-03-01 00:15:00 NaN
2018-03-01 00:20:00 NaN
2018-03-01 00:25:00 NaN
2018-03-01 00:30:00 NaN
2018-03-01 00:35:00 NaN
2018-03-01 00:40:00 NaN

Now, treating your original dataframe as the second one, set its datetime column as the index:

d={'datetime': 
[dt.datetime(2018,3,1,0,0),dt.datetime(2018,3,1,0,10),dt.datetime(2018,3,1,0,40)],
'value':[4.,5.,1.]}

df2=pd.DataFrame(d)
df2.datetime = pd.to_datetime(df2.datetime)
df2.set_index('datetime',inplace=True)
print(df2)

#Output
                   value
datetime    
2018-03-01 00:00:00 4.0
2018-03-01 00:10:00 5.0
2018-03-01 00:40:00 1.0

Finally, assign the values; pandas aligns them on the datetime index, so timestamps with no match stay NaN:

df1['value'] = df2['value']
print(df1)

#output
                   value
2018-03-01 00:00:00 4.0
2018-03-01 00:05:00 NaN
2018-03-01 00:10:00 5.0
2018-03-01 00:15:00 NaN
2018-03-01 00:20:00 NaN
2018-03-01 00:25:00 NaN
2018-03-01 00:30:00 NaN
2018-03-01 00:35:00 NaN
2018-03-01 00:40:00 1.0
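
The same idea can probably be condensed with reindex, which avoids hard-coding the start time and the number of periods. This is a sketch under the assumption that the grid should span the first to the last timestamp of the data; full_index and result are illustrative names.

```python
import datetime as dt
import pandas as pd

df2 = pd.DataFrame({
    'datetime': [dt.datetime(2018, 3, 1, 0, 0),
                 dt.datetime(2018, 3, 1, 0, 10),
                 dt.datetime(2018, 3, 1, 0, 40)],
    'value': [4., 5., 1.],
}).set_index('datetime')

# Build the target 5-minute grid from the data's own bounds
full_index = pd.date_range(df2.index.min(), df2.index.max(), freq='5min')

# reindex keeps values whose timestamps are in full_index and
# fills the remaining timestamps with NaN
result = df2.reindex(full_index)
print(result)
```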
