I am trying to count consecutive zeros and have been stuck on this problem for a couple of hours.

I have a DataFrame like:

Day  |  ID  |  Values
-------------------
1    |  aa  |    0
1    |  aa  |    0
1    |  aa  |    0
1    |  aa  |    0
1    |  aa  |    2.5
1    |  aa  |    2.3
1    |  aa  |    0
1    |  aa  |    0
1    |  aa  |    0
2    |  aa  |    0
2    |  aa  |    0
2    |  aa  |    2.3
2    |  aa  |    0
1    |  bb  |    0
1    |  bb  |    0
1    |  bb  |    0
1    |  bb  |    0
1    |  bb  |    3.5
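
To reproduce this, the table above can be built roughly as follows. This is a sketch: the dtypes are an assumption, since only the printed values are shown.

```python
import pandas as pd

# Same 18 rows as the table above, in the same order.
df = pd.DataFrame({
    "Day": [1] * 9 + [2] * 4 + [1] * 5,
    "ID": ["aa"] * 13 + ["bb"] * 5,
    "Values": [0, 0, 0, 0, 2.5, 2.3, 0, 0, 0,   # ID aa, Day 1
               0, 0, 2.3, 0,                    # ID aa, Day 2
               0, 0, 0, 0, 3.5],                # ID bb, Day 1
})
print(df)
```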

I want to count the consecutive zeros preceding each row, like this:

Day  |  ID  |  Values   | consec_zeros
--------------------------------------
1    |  aa  |    0      |      0
1    |  aa  |    0      |      1
1    |  aa  |    0      |      2
1    |  aa  |    0      |      3
1    |  aa  |    2.5    |      4      # --> 4 consecutive 0s come right before this row
1    |  aa  |    2.3    |      0      # the 2.5 broke the run of zeros
1    |  aa  |    0      |      0
1    |  aa  |    0      |      1
1    |  aa  |    0      |      2      
2    |  aa  |    0      |      0      # no 0s before this row in Day 2
2    |  aa  |    0      |      1
2    |  aa  |    2.3    |      2
2    |  aa  |    0      |      0
1    |  bb  |    0      |      0     # --> no 0s before this in ID 'bb'
1    |  bb  |    0      |      1
1    |  bb  |    0      |      2
1    |  bb  |    0      |      3
1    |  bb  |    3.5    |      4

What I tried was:

g = df['Values'].ne(df['Values'].shift(1)).cumsum()
counts = df.groupby(['ID','Day',g])['Values'].transform('size')
df['consec_zeros'] = np.where(df['Values'].eq(0), counts, 0)
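
For reference, here is a small sketch of what `transform('size')` returns compared with `cumcount()`. The five-row frame `demo` and the column names `size_based` and `cumcount_based` are made up for illustration and are not part of the data above.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame: one run of three zeros, a non-zero value, then a single zero.
demo = pd.DataFrame({"ID": ["aa"] * 5, "Day": [1] * 5, "Values": [0, 0, 0, 2.5, 0]})

g = demo["Values"].ne(demo["Values"].shift(1)).cumsum()
grouped = demo.groupby(["ID", "Day", g])["Values"]

run_size = grouped.transform("size")  # total length of the run, repeated on every row of it
position = grouped.cumcount()         # 0, 1, 2, ... position of the row inside its run

demo["size_based"] = np.where(demo["Values"].eq(0), run_size, 0)
demo["cumcount_based"] = np.where(demo["Values"].eq(0), position, 0)
print(demo)
# size_based     -> 3, 3, 3, 0, 1
# cumcount_based -> 0, 1, 2, 0, 0
```

So `size` repeats the length of the whole run on each of its rows, while an incrementing column needs a running position such as `cumcount`.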

Since I'm new to this, please help and point out what I did wrong.

Thank you in advance

  • Possible duplicate of GroupBy Pandas Count Consecutive Zero's Commented Jun 13, 2019 at 12:02
  • @ascripter I already looked at that thread, but its output does not give the consecutive zeros as an incrementing count like my desired output. Commented Jun 13, 2019 at 12:05

1 Answer

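The main difficulty here is that the first non-zero value after a run of zeros must also receive the next counter value. The counter itself comes from GroupBy.cumcount; in my solution 1 is added to it so that the first zero of a run (counter 1) can be distinguished from rows that get no count (0):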

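import numpy as np

# label runs of equal consecutive values
g = df['Values'].ne(df['Values'].shift(1)).cumsum()
# 1-based position of each row inside its run, per ID and Day
counts = df.groupby(['ID','Day',g])['Values'].cumcount() + 1
df['consec_zeros'] = np.where(df['Values'].eq(0), counts, 0)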

# replace 0 with NaN
a = df['consec_zeros'].mask(df['consec_zeros'].eq(0))
# forward fill at most one missing value per (ID, Day) group and add 1;
# the final - 1 undoes the + 1 that was added to the counter above
df['consec_zeros'] = (np.where(a.isna(), 
                               a.groupby([df['ID'],df['Day']]).ffill(limit=1) + 1, 
                               df['consec_zeros']) - 1)
df['consec_zeros'] = df['consec_zeros'].fillna(0).astype(int)
print(df)
    Day  ID  Values  consec_zeros
0     1  aa     0.0             0
1     1  aa     0.0             1
2     1  aa     0.0             2
3     1  aa     0.0             3
4     1  aa     2.5             4
5     1  aa     2.3             0
6     1  aa     0.0             0
7     1  aa     0.0             1
8     1  aa     0.0             2
9     2  aa     0.0             0
10    2  aa     0.0             1
11    2  aa     2.3             2
12    2  aa     0.0             0
13    1  bb     0.0             0
14    1  bb     0.0             1
15    1  bb     0.0             2
16    1  bb     0.0             3
17    1  bb     3.5             4
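
The desired column can also be read as "the number of consecutive zeros immediately before this row, within the same ID and Day". Under that reading, a shorter variant is possible: count along each run of zeros, then look one row back inside each group. This is only a sketch and not the method used above. The function name `zeros_before` and its parameters are hypothetical, and it assumes the rows of each (ID, Day) group are contiguous and already in order, as in the question.

```python
import pandas as pd

def zeros_before(df, value_col="Values", keys=("ID", "Day")):
    """Number of consecutive zeros directly preceding each row within its group."""
    keys = list(keys)
    is_zero = df[value_col].eq(0)
    # A new run id starts whenever the data switches between zero and non-zero.
    run_id = is_zero.ne(is_zero.shift()).cumsum()
    # 1, 2, 3, ... along each run of zeros; 0 on non-zero rows.
    run_len = (df.groupby(keys + [run_id]).cumcount() + 1).where(is_zero, 0)
    # Look one row back inside each group; the first row of a group has no predecessor.
    shifted = run_len.groupby([df[k] for k in keys]).shift(1)
    return shifted.fillna(0).astype(int)

# Hypothetical small frame: two days for one ID.
small = pd.DataFrame({
    "Day": [1, 1, 1, 1, 2, 2],
    "ID": ["aa"] * 6,
    "Values": [0, 0, 2.5, 0, 0, 2.3],
})
small["zeros_before"] = zeros_before(small)
print(small)
# zeros_before -> 0, 1, 2, 0, 0, 1
```

The group-wise `shift(1)` plays the role of the forward fill with `limit=1`: a non-zero row picks up the count of the zero run that ends just before it, and the count never carries over from one (ID, Day) group into the next.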

2 Comments

Thank you for the good explanation. This really works. Thanks a lot!
@benji - I think I found a bug: the forward filling needs to be done per group, so I edited the answer.
