Consecutive values in pandas column

Question

I have a pandas column df['realize']:

time                      realize
2016-01-18 08:25:00     -46.369083
2016-01-19 14:30:00     -819.010738
2016-01-20 11:10:00    -424.955847
2016-01-21 07:15:00     27.523859
2016-01-21 16:10:00     898.522762
2016-01-25 00:00:00    761.063545

Where time is:

df.index = df['time']
df.index = pd.to_datetime(df.index)

Where df['realize'] is:

In: type(df['realize'])
Out: pandas.core.series.Series

I want to count consecutive values of the same sign. The rule is simple: df['realize'] > 0 versus df['realize'] < 0.

Expected output:

time                      realize    Consecutive
2016-01-18 08:25:00     -46.369083    1
2016-01-19 14:30:00     -819.010738   2
2016-01-20 11:10:00    -424.955847    3
2016-01-21 07:15:00     27.523859     1
2016-01-21 16:10:00     898.522762    2
2016-01-25 00:00:00    761.063545     3

I have read topics about loops but did not find what I need. Thanks in advance for any help.

Hey, if there is a 7th row with a negative value, what should its consecutive value be: 4 or 1?
– Bhanu Tez, Commented Mar 5, 2019 at 18:05
@Artem Reznov: to encourage answerers, please consider upvoting and marking the solution you like as the accepted answer.
– Sandhya Thotakura, Commented Mar 5, 2019 at 18:38

yatu · Accepted Answer · 2019-03-05 20:03:56Z

7

You could do the following:

g = df.realize.gt(0).astype(int).diff().fillna(0).abs().cumsum()
df['Consecutive'] = df.groupby(g).realize.cumcount().add(1)

                 time     realize  Consecutive
0 2016-01-18 08:25:00  -46.369083            1
1 2016-01-19 14:30:00 -819.010738            2
2 2016-01-20 11:10:00 -424.955847            3
3 2016-01-21 07:15:00   27.523859            1
4 2016-01-21 16:10:00  898.522762            2
5 2016-01-25 00:00:00  761.063545            3

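The grouper is obtained by taking the first differences (Series.diff) of a boolean Series indicating whether realize is greater than 0. Each sign change yields a nonzero difference, so the cumulative sum of the absolute differences assigns one label per run: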

diff = df.realize.gt(0).astype(int).diff().fillna(0).abs()
df.assign(diff = diff, grouper = g)

                 time     realize  Consecutive  diff  grouper
0 2016-01-18 08:25:00  -46.369083            1   0.0      0.0
1 2016-01-19 14:30:00 -819.010738            2   0.0      0.0
2 2016-01-20 11:10:00 -424.955847            3   0.0      0.0
3 2016-01-21 07:15:00   27.523859            1   1.0      1.0
4 2016-01-21 16:10:00  898.522762            2   0.0      1.0
5 2016-01-25 00:00:00  761.063545            3   0.0      1.0
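As a self-contained sketch of the same run-labelling idea, the snippet below rebuilds the sample data and labels runs by comparing the sign flag with its shifted copy. This is a common equivalent spelling, not the exact expression used above. The 7th row and the names positive and run_id are assumptions added for illustration; the extra row only shows that the count restarts at 1 after a sign change.

```python
import pandas as pd

# Sample data from the question, plus a hypothetical 7th (negative) row
# added here only to show that the count restarts at 1.
df = pd.DataFrame(
    {
        "time": pd.to_datetime(
            [
                "2016-01-18 08:25:00",
                "2016-01-19 14:30:00",
                "2016-01-20 11:10:00",
                "2016-01-21 07:15:00",
                "2016-01-21 16:10:00",
                "2016-01-25 00:00:00",
                "2016-01-26 00:00:00",  # hypothetical extra row
            ]
        ),
        "realize": [
            -46.369083, -819.010738, -424.955847,
            27.523859, 898.522762, 761.063545,
            -10.0,  # hypothetical extra value
        ],
    }
)

# True where the value is strictly positive (zeros count as non-positive).
positive = df["realize"].gt(0)

# A new run starts wherever the flag differs from the previous row;
# the cumulative sum of those change points gives one label per run.
run_id = positive.ne(positive.shift()).cumsum()

# Position within each run, starting at 1.
df["Consecutive"] = df.groupby(run_id).cumcount().add(1)

print(df["Consecutive"].tolist())  # [1, 2, 3, 1, 2, 3, 1]
```

Note that an exact zero falls into the non-positive group here; the question only defines the rule for values above and below 0, so that choice is an assumption.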

edited Mar 5, 2019 at 20:03

answered Mar 5, 2019 at 17:26

yatu


7 Comments

Artem Reznov Over a year ago

With the first solution I am receiving this output for df[['realiz_cash','Consecutive']]:

time                      realize    Consecutive
2016-01-18 08:25:00     -46.369083    282
2016-01-19 14:30:00     -819.010738   1
2016-01-20 11:10:00    -424.955847    890
2016-01-21 07:15:00     27.523859     1131
2016-01-21 16:10:00     898.522762    2
2016-01-25 00:00:00    761.063545     1583

yatu Over a year ago

I was working with a DataFrame; it seems like yours could be a pd.Series? Try doing df = df.reset_index() at the beginning.

Artem Reznov Over a year ago

I tried, but still got this. Let me be clearer: the count must reset whenever the sign changes. For example, after a streak of 3 positive values the count is 3; if the 4th value is negative, the count must reset to 1.

PMende Over a year ago

This correctly answers the question and will be significantly more performant than the other posted response.

Artem Reznov Over a year ago

@yatu I don't know who downvoted, but I really appreciate it. I checked the new solution and it is correct, thanks a lot!

Bhanu Tez · Answer · 2019-03-05 18:13:54Z

0

My solution.

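i = 0; j = 0  # i: current positive streak, j: current non-positive streak

def cons(x):
    # The counters live at module level, so reset them to 0 before re-running.
    global i; global j
    if x > 0:
        i += 1; j = 0
        return i
    else:
        j += 1; i = 0
        return j

df['consecutive'] = df['realize'].map(cons)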

I hope the solution is helpful.
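The loop above relies on module-level counters, so running it twice without resetting them gives wrong counts. A minimal sketch of the same idea with the state kept inside a function might look like this. The name streak_lengths is hypothetical, and like the code above it treats zero as non-positive.

```python
import pandas as pd


def streak_lengths(values):
    """Return the running length of each positive / non-positive streak."""
    counts = []
    prev_positive = None  # no previous value before the first element
    run = 0
    for x in values:
        is_positive = bool(x > 0)
        # Extend the current streak if the sign flag is unchanged,
        # otherwise start a new streak at 1.
        run = run + 1 if is_positive == prev_positive else 1
        prev_positive = is_positive
        counts.append(run)
    return counts


# Sample values from the question.
realize = pd.Series(
    [-46.369083, -819.010738, -424.955847, 27.523859, 898.522762, 761.063545]
)
consecutive = pd.Series(streak_lengths(realize), index=realize.index)

print(consecutive.tolist())  # [1, 2, 3, 1, 2, 3]
```

Because all state is local to the call, the function can be applied repeatedly, or to several columns, without any manual reset.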

answered Mar 5, 2019 at 18:13

Bhanu Tez


2 Comments

Sandhya Thotakura Over a year ago

This solution is simple and elegant.

koldLight Over a year ago

Using globals and ; in Python can't be considered "elegant".
