I would like to use daily data in one dataframe as a qualifier for running some code on another dataframe. Both dataframes contain the columns ['Date', 'Time', 'Ticker', 'Open', 'High', 'Low', 'Close']. One dataframe holds only daily bars; the other holds intraday bars with the same fields. Here are some examples.

print(df)

       Date     Time Ticker     Open     High      Low    Close
0  01/02/18  3:00 PM     ES  2687.00  2696.00  2681.75  2695.75
1  01/03/18  3:00 PM     ES  2697.25  2714.25  2697.00  2712.50
2  01/04/18  3:00 PM     ES  2719.25  2729.00  2718.25  2724.00
3  01/05/18  3:00 PM     ES  2732.25  2743.00  2726.50  2741.25
4  01/08/18  3:00 PM     ES  2740.25  2748.50  2737.00  2746.50
5  01/09/18  3:00 PM     ES  2751.00  2760.00  2748.00  2753.00
6  01/10/18  3:00 PM     ES  2744.00  2751.75  2736.50  2748.75
7  01/11/18  3:00 PM     ES  2754.25  2768.50  2752.75  2768.00
8  01/12/18  3:00 PM     ES  2771.25  2788.75  2770.00  2786.50
9  01/15/18  3:00 PM     ES  2793.75  2796.00  2792.50  2794.50

print(df_tick)

           Date      Time Ticker     Open     High      Low    Close
0      01/02/18   8:45 AM     ES  2687.00  2687.25  2681.75  2685.75
1      01/02/18   9:00 AM     ES  2686.00  2687.75  2683.50  2687.50
2      01/02/18   9:15 AM     ES  2687.50  2690.50  2687.25  2689.25
3      01/02/18   9:30 AM     ES  2689.50  2692.00  2689.25  2692.00
4      01/02/18   9:45 AM     ES  2692.00  2692.25  2687.25  2690.00
5      01/02/18  10:00 AM     ES  2690.00  2691.00  2689.75  2690.75
6      01/02/18  10:15 AM     ES  2690.50  2691.25  2690.25  2691.00
7      01/02/18  10:30 AM     ES  2691.00  2692.00  2689.00  2689.50
8      01/02/18  10:45 AM     ES  2689.50  2689.75  2687.75  2688.25
9      01/02/18  11:00 AM     ES  2688.25  2689.50  2687.75  2689.25
10     01/02/18  11:15 AM     ES  2689.25  2690.75  2689.25  2690.00
11     01/02/18  11:30 AM     ES  2690.00  2690.75  2689.25  2690.00
12     01/02/18  11:45 AM     ES  2690.25  2690.50  2688.50  2688.75
13     01/02/18  12:00 PM     ES  2689.00  2689.25  2688.50  2689.25
14     01/02/18  12:15 PM     ES  2689.25  2691.00  2689.00  2690.50
15     01/02/18  12:30 PM     ES  2690.75  2691.00  2689.75  2690.50
16     01/02/18  12:45 PM     ES  2690.75  2691.25  2690.25  2691.00
17     01/02/18   1:00 PM     ES  2691.25  2691.25  2689.50  2690.75
18     01/02/18   1:15 PM     ES  2690.50  2691.50  2690.25  2690.50
19     01/02/18   1:30 PM     ES  2690.50  2691.00  2689.75  2690.75
20     01/02/18   1:45 PM     ES  2690.75  2691.50  2690.25  2690.75
21     01/02/18   2:00 PM     ES  2690.75  2691.25  2690.75  2691.00
22     01/02/18   2:15 PM     ES  2691.25  2691.75  2690.50  2691.50
23     01/02/18   2:30 PM     ES  2691.50  2693.00  2691.50  2692.75
24     01/02/18   2:45 PM     ES  2693.00  2693.75  2691.00  2693.75
25     01/02/18   3:00 PM     ES  2693.75  2696.00  2693.25  2695.75
26     01/03/18   8:45 AM     ES  2697.25  2702.25  2697.00  2700.75
27     01/03/18   9:00 AM     ES  2701.00  2703.75  2700.50  2703.25
28     01/03/18   9:15 AM     ES  2703.25  2706.00  2703.00  2705.00
29     01/03/18   9:30 AM     ES  2705.00  2707.25  2704.00  2706.50

Code for calculating the gap percentage

#Calculating Gap Percentage
df['Gap %'] = df['Open'].sub(df['Close'].shift()).div(df['Close'] - 1).fillna(0) * 100
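
For comparison, here is a minimal sketch of the same calculation on the first three daily rows shown above. It assumes the gap is meant to be measured relative to the previous day's close, which is not quite what the `df['Close'] - 1` denominator in the line above does, so the percentages come out slightly different. The intermediate `prev_close` name is only for illustration.

```python
import pandas as pd

# First three daily rows from the df printed above.
df = pd.DataFrame({
    "Date": ["01/02/18", "01/03/18", "01/04/18"],
    "Open": [2687.00, 2697.25, 2719.25],
    "Close": [2695.75, 2712.50, 2724.00],
})

# Previous day's close; NaN for the first row, which has no prior day.
prev_close = df["Close"].shift()

# Absolute gap: today's open minus yesterday's close.
df["Gap"] = (df["Open"] - prev_close).fillna(0)

# Assumption: percentage gap is relative to the previous close.
df["Gap %"] = (df["Gap"] / prev_close).fillna(0) * 100

print(df)
```

The first row has no previous close, so both columns are filled with 0 there.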

I have code that computes, for df, the percentage change from the previous day's Close to the current Open, and I would like to use this information as a qualifier to run some code on df_tick.

For example, if df['Gap %'] > 0.2, I want to keep that date in df_tick and ignore (or drop) all other dates.

#drop rows not meeting certain percentage
df.drop(df[df['Gap %'] < .2].index, inplace=True)

print(df)

       Date     Time Ticker     Open    High      Low    Close   Gap     Gap %
2  01/04/18  3:00 PM     ES  2719.25  2729.0  2718.25  2724.00  6.75  0.247888
3  01/05/18  3:00 PM     ES  2732.25  2743.0  2726.50  2741.25  8.25  0.301067
9  01/15/18  3:00 PM     ES  2793.75  2796.0  2792.50  2794.50  7.25  0.259531

Now I'd like to use df['Date'] to find the matching dates in df_tick['Date'], so that I can run some code I've already written on them. I tried to simply drop all rows whose dates don't match, but received an error.

#drop rows in df_tick not matching dates in df
df_tick.drop(df_tick[df_tick['Date'] != df['Date']].index, inplace=True)

ValueError: Can only compare identically-labeled Series objects

1 Answer

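The != comparison works element by element and requires both Series to carry identical index labels. df['Date'] and df_tick['Date'] have different lengths, so the comparison raises the ValueError, and resetting the index on both dataframes is unlikely to be enough on its own. What you want is a membership test rather than an element-wise comparison, so I would try this: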

df_tick = df_tick[df_tick.Date.isin(df.Date.unique())]
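
A small self-contained sketch of the difference, in case it helps. The intraday rows are copied from the question, but treating 01/03/18 as the only qualifying date is a made-up assumption for illustration (the real qualifying dates are not in the posted df_tick sample), and the `comparison_failed` flag is just a helper name.

```python
import pandas as pd

# Hypothetical daily frame: pretend only 01/03/18 passed the gap filter.
df = pd.DataFrame({"Date": ["01/03/18"]})

# Four intraday rows taken from the question's df_tick sample.
df_tick = pd.DataFrame({
    "Date": ["01/02/18", "01/02/18", "01/03/18", "01/03/18"],
    "Time": ["8:45 AM", "9:00 AM", "8:45 AM", "9:00 AM"],
    "Close": [2685.75, 2687.50, 2700.75, 2703.25],
})

# Element-wise comparison needs identically labeled Series, so this fails.
comparison_failed = False
try:
    df_tick["Date"] != df["Date"]
except ValueError:
    comparison_failed = True

# Membership test: keep intraday rows whose date appears in the daily frame.
filtered = df_tick[df_tick["Date"].isin(df["Date"].unique())]

print(comparison_failed)
print(filtered)
```

The filtered frame keeps its original index labels; call `reset_index(drop=True)` on it if you need a fresh 0-based index. Note that `isin` compares values as they are stored, so both Date columns should have the same dtype (both strings here).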