3

Current df:

Date                 Power
2011-04-18 17:00:00  243.56
2011-04-18 17:00:01  245.83
2011-04-18 17:00:02  246.02
2011-04-18 17:00:03  245.72
2011-04-18 17:00:04  244.71
2011-04-18 17:00:05  245.93
2011-04-18 17:00:06  243.12
2011-04-18 17:00:07  244.72
2011-04-18 17:00:08  242.44
2011-04-18 17:00:09  246.42
2011-04-18 17:00:10  245.02
...                     ...

I have the df with Date and a float number. Date is the index and is unique. I would like to create a new df based on the dates found in the next df.

     date start          date end
0    2011-04-18 17:00:01 2011-04-18 17:00:02
1    2011-04-18 17:00:05 2011-04-18 17:00:06
2    2011-04-18 17:00:08 2011-04-18 17:00:10
...                  ...                 ...

I expect to get:

Date                 Power
2011-04-18 17:00:01  245.83
2011-04-18 17:00:02  246.02
2011-04-18 17:00:05  245.93
2011-04-18 17:00:06  243.12
2011-04-18 17:00:08  242.44
2011-04-18 17:00:09  246.42
2011-04-18 17:00:10  245.02
...                     ...

In other word I want to filter the initial df and find all rows between all the dates found in the second df.

I thought of using pandas.DataFrame.between_time. But the issue is this works only for 1 given date start and date end. How can I do this with many different date periods?

1 Answer 1

2

Use np.logical_or.reduce with list comprehension:

L = [df1['Date'].between(s, e) for s, e in df2[['date start','date end']].to_numpy()]

df = df1[np.logical_or.reduce(L)]
print (df)
                  Date   Power
1  2011-04-18 17:00:01  245.83
2  2011-04-18 17:00:02  246.02
5  2011-04-18 17:00:05  245.93
6  2011-04-18 17:00:06  243.12
8  2011-04-18 17:00:08  242.44
9  2011-04-18 17:00:09  246.42
10 2011-04-18 17:00:10  245.02

If DatetimeIndex is possible use:

L = [df1[s:e] for s, e in df2[['date start','date end']].to_numpy()]

df = pd.concat(L)
print (df)
                      Power
Date                       
2011-04-18 17:00:01  245.83
2011-04-18 17:00:02  246.02
2011-04-18 17:00:05  245.93
2011-04-18 17:00:06  243.12
2011-04-18 17:00:08  242.44
2011-04-18 17:00:09  246.42
2011-04-18 17:00:10  245.02



L = [(df1.index >= s) & (df1.index <= e) 
     for s, e in df2[['date start','date end']].to_numpy()]

df = df1[np.logical_or.reduce(L)]
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.