I have a pandas Series, df['realize']:

time                      realize
2016-01-18 08:25:00     -46.369083
2016-01-19 14:30:00     -819.010738
2016-01-20 11:10:00    -424.955847
2016-01-21 07:15:00     27.523859
2016-01-21 16:10:00     898.522762
2016-01-25 00:00:00    761.063545

Where time is:

df.index = df['time']
df.index = pd.to_datetime(df.index)

Where df['realize'] is:

In: type(df['realize'])
Out: pandas.core.series.Series

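I want to count consecutive values of the same sign. The rule is simple: a streak continues while df['realize'] > 0 (or while df['realize'] < 0), and the count restarts at 1 when the sign changes.

Expected output: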

time                      realize    Consecutive
2016-01-18 08:25:00     -46.369083    1
2016-01-19 14:30:00     -819.010738   2
2016-01-20 11:10:00    -424.955847    3
2016-01-21 07:15:00     27.523859     1
2016-01-21 16:10:00     898.522762    2
2016-01-25 00:00:00    761.063545     3

I read several topics about loops but didn't find what I need. Thanks in advance for your help.

3 Comments

  • Hey, if there is a 7th row with a negative value, what should its Consecutive value be: 4 or 1? Commented Mar 5, 2019 at 18:05
  • @BhanuTez It must be 1. Commented Mar 5, 2019 at 18:09
  • @Artem Reznov: to encourage the users, please consider upvoting and marking the solution you like as the accepted answer. Commented Mar 5, 2019 at 18:38

2 Answers

You could do the following:

g = df.realize.gt(0).astype(int).diff().fillna(0).abs().cumsum()
df['Consecutive'] = df.groupby(g).realize.cumcount().add(1)

                 time     realize  Consecutive
0 2016-01-18 08:25:00  -46.369083            1
1 2016-01-19 14:30:00 -819.010738            2
2 2016-01-20 11:10:00 -424.955847            3
3 2016-01-21 07:15:00   27.523859            1
4 2016-01-21 16:10:00  898.522762            2
5 2016-01-25 00:00:00  761.063545            3

The grouper is obtained by taking the first differences (Series.diff) of a boolean Series indicating whether realize is greater than 0, and then taking the cumulative sum of their absolute values, so the label increases by 1 at every sign change:

diff = df.realize.gt(0).astype(int).diff().fillna(0).abs()
df.assign(diff = diff, grouper = g)

                 time     realize  Consecutive  diff  grouper
0 2016-01-18 08:25:00  -46.369083            1   0.0      0.0
1 2016-01-19 14:30:00 -819.010738            2   0.0      0.0
2 2016-01-20 11:10:00 -424.955847            3   0.0      0.0
3 2016-01-21 07:15:00   27.523859            1   1.0      1.0
4 2016-01-21 16:10:00  898.522762            2   0.0      1.0
5 2016-01-25 00:00:00  761.063545            3   0.0      1.0
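
For a self-contained check, the same idea can be sketched with the sample data from the question. This variant labels runs by comparing each row's sign flag with the previous row's (`ne` plus `shift`) instead of `diff`; that is a swapped-in but equivalent way to build the grouper, and the names `is_pos` and `run_id` are only illustrative.

```python
import pandas as pd

# Sample data copied from the question, indexed by time.
df = pd.DataFrame(
    {"realize": [-46.369083, -819.010738, -424.955847,
                 27.523859, 898.522762, 761.063545]},
    index=pd.to_datetime([
        "2016-01-18 08:25:00", "2016-01-19 14:30:00", "2016-01-20 11:10:00",
        "2016-01-21 07:15:00", "2016-01-21 16:10:00", "2016-01-25 00:00:00",
    ]),
)
df.index.name = "time"

# True where the value is positive, False otherwise (zero counts as non-positive).
is_pos = df["realize"].gt(0)

# A new run starts whenever the flag differs from the previous row's flag;
# the cumulative sum of those change points gives each run its own label.
run_id = is_pos.ne(is_pos.shift()).cumsum()

# Number the rows within each run, starting at 1.
df["Consecutive"] = df.groupby(run_id).cumcount().add(1)

print(df)
```

Because the grouping is done on a Series aligned with the frame's index, this should work whether `time` is a column or the index.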

7 Comments

With the first solution, df[['realiz_cash','Consecutive']] gives me:
time                      realize    Consecutive
2016-01-18 08:25:00     -46.369083    282
2016-01-19 14:30:00     -819.010738   1
2016-01-20 11:10:00     -424.955847   890
2016-01-21 07:15:00     27.523859     1131
2016-01-21 16:10:00     898.522762    2
2016-01-25 00:00:00     761.063545    1583
I was working with a DataFrame; it seems yours could be a pd.Series? Try running df = df.reset_index() at the beginning.
Tried it, still got the same result. Let me be clearer: the count must reset whenever the sign changes. For example, after a streak of 3 positive values the count is 3; if the 4th value is negative, the count must reset to 1.
This correctly answers the question and will be significantly more performant than the other posted response.
@yatu I don't know who downvoted, but I really appreciate it. I checked the new solution and it is correct, thanks a lot!

My solution.

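i=0;j=0  # i: length of the current positive streak, j: length of the current non-positive streak
def cons(x):
    global i;global j
    if x>0:
        i += 1;j=0  # extend the positive streak, reset the other counter
        return i
    else:
        j += 1;i=0  # extend the non-positive streak (zero lands here too)
        return j


# The counters are module-level: reset i and j to 0 before running this again.
df['consecutive'] = df['realize'].map(cons)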

I hope the solution is helpful.
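
The same loop can also be written without module-level state. The sketch below is one possible variant, not part of the original answer; `streak_lengths` is a hypothetical helper name, and it keeps the rule used above that zero is treated like a negative value.

```python
def streak_lengths(values):
    """Return the running length of each same-sign streak.

    Zero is treated as non-positive, matching the x > 0 test above.
    """
    counts = []
    prev_positive = None  # no previous value yet
    run = 0
    for x in values:
        positive = x > 0
        # Extend the streak if the sign flag is unchanged, otherwise restart at 1.
        run = run + 1 if positive == prev_positive else 1
        counts.append(run)
        prev_positive = positive
    return counts


sample = [-46.369083, -819.010738, -424.955847, 27.523859, 898.522762, 761.063545]
print(streak_lengths(sample))  # [1, 2, 3, 1, 2, 3]
```

With the frame from the question this would be applied as df['consecutive'] = streak_lengths(df['realize']). Since all state lives inside the function, it can be called repeatedly without resetting anything.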

2 Comments

This solution is simple and elegant.
Using globals and ; in Python can't be considered "elegant".
