python - count occurrences above in lambda function

Question

I have a dataframe df with a column X. I would like to create a new column Y derived from X.

Y should be 1 if X in the same row is 1 and there are at least n consecutive 0s (also in X) directly above it, where n is a variable. If there are fewer than n zeros above, Y should be "". I spent hours trying np.where without success. I think I need a lambda function, but I have no idea how to start or what to research.

Example with n = 4:

On 2018-01-25, the result is 1 because X is 1 and there are more than 4 zeros above it.

On 2018-02-02, the result is "" because there are only 3 zeros above it (not 4).

Dates         X  Y (expected)
2018-01-02    0
2018-01-03    0
2018-01-04    0
2018-01-05    0
2018-01-08    0
2018-01-09    0
2018-01-10    0
2018-01-11    0
2018-01-12    0
2018-01-15    0
2018-01-16    0
2018-01-17    0
2018-01-18    0
2018-01-19    0
2018-01-22    0
2018-01-23    0
2018-01-24    0
2018-01-25    1  1
2018-01-29    0  
2018-01-30    0  
2018-01-31    0  
2018-02-02    1  
2018-02-05    0  
2018-02-06    0
2018-02-07    0
2018-02-08    0
2018-02-09    1  1
2018-02-12    1
2018-02-13    0

Can you post your expected output? I assume that if Signal_F is 1 you want to add a new column where Signal_X = 1, but only if the four rows above are 0? So, IIUC, 2018-02-01 will have 1 in Signal_X?
– Umar.H, Commented Nov 8, 2019 at 21:35
No, I expect 0 for 2018-02-02, because there are just three 0's above, not 4. I tried to post the expected column, but I changed the names of the columns.
– Alex, Commented Nov 8, 2019 at 21:49

Umar.H · Accepted Answer · 2022-07-20 11:22:03Z

1

We can label each run of consecutive equal values in X with a cumsum, then use a grouped cumcount to measure how long the run of zeros above each 1 is.

n = 4  # minimum number of zeros required directly above a 1

# Label each run of identical consecutive values in X.
s = df['X'].ne(df['X'].shift()).cumsum()

# Running position within each run (1, 2, 3, ...).
df['Count'] = df.groupby(s).cumcount() + 1

# Length of the zero run ending on the row directly above (0 if that row is not a zero).
zeros_above = df['Count'].where(df['X'].eq(0), 0).shift(fill_value=0)

# Set Y to 1 where X is 1 and at least n zeros sit directly above; other rows stay NaN.
df.loc[df['X'].eq(1) & zeros_above.ge(n), 'Y'] = 1

df.drop('Count', axis=1, inplace=True)  # drop the helper column

print(df[df['Y'] == 1])  # show only the flagged rows
         Dates  X    Y
17  2018-01-25  1  1.0
26  2018-02-09  1  1.0
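
For reference, here is a minimal self-contained sketch of the same run-counting idea, applied to the X values from the question. The function name flag_ones_after_zero_run and the variable x_values are made up for illustration, and the sketch assumes X holds only 0s and 1s and that the zeros must be consecutive and directly above the 1. It writes "" instead of NaN for non-matching rows, as the question asks.

```python
import pandas as pd


def flag_ones_after_zero_run(x: pd.Series, n: int) -> pd.Series:
    """Return 1 where x is 1 and at least n consecutive zeros sit directly above, else ""."""
    is_zero = x.eq(0)
    # Give every run of identical consecutive values its own id.
    run_id = x.ne(x.shift()).cumsum()
    # Running length of the current zero run; 0 on rows where x is not 0.
    zero_run = is_zero.astype(int).groupby(run_id).cumsum()
    # Length of the zero run that ends on the row directly above.
    zeros_above = zero_run.shift(fill_value=0)
    hit = x.eq(1) & zeros_above.ge(n)
    return hit.map({True: 1, False: ""})


# X values copied from the question's table (2018-01-02 through 2018-02-13).
x_values = [0] * 17 + [1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0]
df = pd.DataFrame({"X": x_values})
df["Y"] = flag_ones_after_zero_run(df["X"], n=4)

# Positions of the rows flagged with Y == 1.
print(df.index[df["Y"].eq(1)].tolist())
```

Because the zero count is reset whenever X is not 0, the second of two adjacent 1s (2018-02-12 in the example) is never flagged.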

edited Jul 20, 2022 at 11:22

answered Nov 8, 2019 at 22:31


2 Comments

Alex Over a year ago

hi, thank you for your solution and time. i tried the code but i got: NullFrequencyError: Cannot shift with no freq

Umar.H Over a year ago

then you have nulls in your dataframe that you need to handle before you can use the code.
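
Another possible cause, not confirmed in this thread: if Dates is the dataframe's index (a DatetimeIndex without a frequency), subtracting an integer from that index in older pandas versions attempts a date shift rather than a step back by one row, which can raise this error. The sketch below is one way to sidestep index arithmetic entirely, assuming that setup. It uses a rolling window over the n rows directly above each row, which is a different technique from the answer's cumcount, and the sample frame is a shortened copy of the question's data.

```python
import pandas as pd

# Shortened copy of the question's data, with Dates as a DatetimeIndex (assumed setup).
dates = pd.to_datetime([
    "2018-01-19", "2018-01-22", "2018-01-23", "2018-01-24", "2018-01-25",
    "2018-01-29", "2018-01-30", "2018-01-31", "2018-02-02", "2018-02-05",
    "2018-02-06", "2018-02-07", "2018-02-08", "2018-02-09", "2018-02-12",
    "2018-02-13",
])
df = pd.DataFrame({"X": [0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0]}, index=dates)

n = 4
is_zero = df["X"].eq(0).astype(int)
# Number of zeros among the n rows directly above each row.
# rolling(n) and shift() work by position here, so the index type does not matter.
zeros_in_window_above = is_zero.rolling(n).sum().shift()
hit = df["X"].eq(1) & zeros_in_window_above.eq(n)
df["Y"] = hit.map({True: 1, False: ""})

# Dates of the rows flagged with Y == 1.
print(df.index[df["Y"].eq(1)].strftime("%Y-%m-%d").tolist())
```

Requiring all n rows above to be zero is equivalent to requiring a run of at least n zeros directly above, so this gives the same flags as the run-counting approach.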
