Assign column status retrospectively in pandas

Question

I have created the following pandas dataframe:

import pandas as pd
import numpy as np

ds = {'col1' : [234,321,284,286,287,300,301,303,305,299,288,300,299,287,286,280,279,270,269,301]}

df = pd.DataFrame(data=ds)

The dataframe looks like this:

display(df)

    col1
0   234
1   321
2   284
3   286
4   287
5   300
6   301
7   303
8   305
9   299
10  288
11  300
12  299
13  287
14  286
15  280
16  279
17  270
18  269
19  301

I have then created two columns (cnsIncr , cnsDecr), which calculate the consecutive increase and decrease in col1 respectively.

cnsIncr = []
cnsDecr = []

col1 = np.array(df['col1'])

for i in range(len(df)):
    cnsIncr.append(0)
    cnsDecr.append(0)
    if(col1[i] > col1[i-1]):
        cnsIncr[i] = cnsIncr[i-1]+1
    else:
        cnsIncr[i] = 0

    if(col1[i] < col1[i-1]):
        cnsDecr[i] = cnsDecr[i-1]+1
    else:
        cnsDecr[i] = 0

df['cnsIncr'] = cnsIncr
df['cnsDecr'] = cnsDecr

The resulting dataframe looks as follows:

I need to create a columns called Trend which is populated as follows:

if the column cnsIncr contains a value greater or equal to 5, then Trend = UpTrend starting from the record in which consIncr = 0 and ending at the record for which cnsIncr >= 5;
if the column cnsDecr contains a value greater or equal to 5, then Trend = DownTrend starting from the record in which cnsDecr = 0 and ending at the record for which cnsDecr >= 5;
the neither the column cnsIncr nor the column cnsDecr contain a value of 5, then Trend = NoTrend

So, from the example above, the resulting dataframe would look like this:

How can I do this?

mozway · Accepted Answer · 2025-07-25 08:39:42Z

I think a simple approach, assuming groups always start with a 0 and are increasing until a reset, would be to groupby.transform for stretches starting with 0 and assigning the value based on the last member of the group:

N = 5

# form groups starting with 0 for each column
g1 = df['cnsIncr'].eq(0).cumsum()
g2 = df['cnsDecr'].eq(0).cumsum()

# identify if the last value is >= 5
m1 = df['cnsIncr'].groupby(g1).transform('last').ge(N)
m2 = df['cnsDecr'].groupby(g2).transform('last').ge(N)

# assign the names based on the masks
df['Trend'] = np.select([m1, m2], ['TrendUp', 'TrendDown'], 'NoTrend')

Output:

    col1  cnsIncr  cnsDecr      Trend
0    234        0        1    NoTrend
1    321        1        0    NoTrend
2    284        0        1    TrendUp
3    286        1        0    TrendUp
4    287        2        0    TrendUp
5    300        3        0    TrendUp
6    301        4        0    TrendUp
7    303        5        0    TrendUp
8    305        6        0    TrendUp
9    299        0        1    NoTrend
10   288        0        2    NoTrend
11   300        1        0  TrendDown
12   299        0        1  TrendDown
13   287        0        2  TrendDown
14   286        0        3  TrendDown
15   280        0        4  TrendDown
16   279        0        5  TrendDown
17   270        0        6  TrendDown
18   269        0        7  TrendDown
19   301        1        0    NoTrend

Intermediates:

    col1  cnsIncr  cnsDecr      Trend  g1  g2     m1     m2
0    234        0        1    NoTrend   1   0  False  False
1    321        1        0    NoTrend   1   1  False  False
2    284        0        1    TrendUp   2   1   True  False
3    286        1        0    TrendUp   2   2   True  False
4    287        2        0    TrendUp   2   3   True  False
5    300        3        0    TrendUp   2   4   True  False
6    301        4        0    TrendUp   2   5   True  False
7    303        5        0    TrendUp   2   6   True  False
8    305        6        0    TrendUp   2   7   True  False
9    299        0        1    NoTrend   3   7  False  False
10   288        0        2    NoTrend   4   7  False  False
11   300        1        0  TrendDown   4   8  False   True
12   299        0        1  TrendDown   5   8  False   True
13   287        0        2  TrendDown   6   8  False   True
14   286        0        3  TrendDown   7   8  False   True
15   280        0        4  TrendDown   8   8  False   True
16   279        0        5  TrendDown   9   8  False   True
17   270        0        6  TrendDown  10   8  False   True
18   269        0        7  TrendDown  11   8  False   True
19   301        1        0    NoTrend  11   9  False  False

Note that you don't need to compute the cnsIncr/cnsDecr columns with a loop:

N = 5

g1 = df['cnsIncr'].eq(0).cumsum()
g2 = df['cnsDecr'].eq(0).cumsum()

m1 = df['cnsIncr'].groupby(g1).transform('last').gt(N)
m2 = df['cnsDecr'].groupby(g2).transform('last').gt(N)

df['Trend'] = np.select([m1, m2], ['TrendUp', 'TrendDown'], 'NoTrend')

Output with intermediates:

    col1      Trend  g1  g2     m1     m2
0    234    NoTrend   1   0  False  False
1    321    NoTrend   1   1  False  False
2    284    TrendUp   2   1   True  False
3    286    TrendUp   2   2   True  False
4    287    TrendUp   2   3   True  False
5    300    TrendUp   2   4   True  False
6    301    TrendUp   2   5   True  False
7    303    TrendUp   2   6   True  False
8    305    TrendUp   2   7   True  False
9    299    NoTrend   3   7  False  False
10   288    NoTrend   4   7  False  False
11   300  TrendDown   4   8  False   True
12   299  TrendDown   5   8  False   True
13   287  TrendDown   6   8  False   True
14   286  TrendDown   7   8  False   True
15   280  TrendDown   8   8  False   True
16   279  TrendDown   9   8  False   True
17   270  TrendDown  10   8  False   True
18   269  TrendDown  11   8  False   True
19   301    NoTrend  11   9  False  False

Collectives™ on Stack Overflow

Assign column status retrospectively in pandas

1 Answer 1

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Your Answer

Sign up or log in

Post as a guest

Related