Assign values in dataframe group using previous group row

Question

Consider the following dataframe:

A |  B |  C
_____________
a |  1 |  1
a |  5 |  NaN
b |  3 |  1
b |  4 |  NaN
c |  2 |  1
c |  2 |  NaN
a |  1 |  NaN
b |  3 |  NaN
c |  4 |  NaN
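
For anyone who wants to reproduce this, the table above can be built roughly as follows. This is only a sketch: the variable name `df` and the use of float `NaN` for the missing C values are assumptions, not something stated in the question.

```python
import numpy as np
import pandas as pd

# Sample data from the table above; missing C values are represented as NaN.
df = pd.DataFrame({
    "A": ["a", "a", "b", "b", "c", "c", "a", "b", "c"],
    "B": [1, 5, 3, 4, 2, 2, 1, 3, 4],
    "C": [1, np.nan, 1, np.nan, 1, np.nan, np.nan, np.nan, np.nan],
})
print(df)
```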

My goal is to update column C using a rule that also depends on the previous row within each group. As an example: if the value in column B is smaller than the previous B value in the same group, C should be 0; otherwise C keeps the value of the previous C.

So this would give me the following:

A |  B |  C
_____________
a |  1 |  1
a |  5 |  1
b |  3 |  1
b |  4 |  1
c |  2 |  1
c |  2 |  1
a |  1 |  0
b |  3 |  0
c |  4 |  1

I was thinking of using something like

df.groupby('A').apply(lambda x: x['C'].shift(1) if x['B'].shift(1) >= x['B'] else 0)

but obviously this does not work, as apply cannot access earlier rows (I think).
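
For what it's worth, calling `shift` on a groupby object does give each row the value from the previous row of the same group, so earlier rows are reachable. Separately, using Python's `if ... else` on a whole Series raises a `ValueError` about an ambiguous truth value, which is one reason the one-liner fails. A minimal sketch, assuming the sample data from the question (`prev_B` and `is_smaller` are hypothetical names used only for illustration):

```python
import pandas as pd

# Columns A and B from the sample data in the question.
df = pd.DataFrame({
    "A": ["a", "a", "b", "b", "c", "c", "a", "b", "c"],
    "B": [1, 5, 3, 4, 2, 2, 1, 3, 4],
})

# Previous B within the same group of A. The first row of each group
# has no predecessor, so it gets NaN.
prev_B = df.groupby("A")["B"].shift(1)

# Element-wise comparison; comparing against NaN yields False.
is_smaller = df["B"] < prev_B
print(is_smaller.tolist())
# [False, False, False, False, False, False, True, True, False]
```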

If all else fails, I would build a separate DataFrame for each group and modify each one individually, so that no other group's rows are involved, but I believe there must be a more elegant solution using the original dataframe.

Any suggestions?

"Keep the value from the previous C" means the row with B = 4 (the last row) should be 0. What's the reason for it being 1?
– sammywemmy, Commented May 5, 2021 at 13:02
@sammywemmy, in my example, the B value for the last row is 4. Because the last row belongs to the "c" group, it is bigger than the previous B value for that group, so we keep the previous C value, which is 1. If the value of B had been 1, the C column would have had a value of 0 in that case.
– habarnam, Commented May 5, 2021 at 13:07

Nk03 · Accepted Answer · 2021-05-05 13:10:43Z

2

Try:

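import numpy as np

def fill(x):
    x = x.copy()  # work on a copy so the original group is not mutated
    x['C'] = x['C'].ffill()  # carry the last known C forward within the group
    # Set C to 0 where B is strictly smaller than the previous B in the group
    x['C'] = np.where(x['B'] < x['B'].shift(1), 0, x['C'])
    return x

# group_keys=False keeps the original row index instead of adding A as an index level
df = df.groupby('A', group_keys=False).apply(fill)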

Here, the idea is to first forward-fill the NaN values in C with the previous value within each group, then replace the value with 0 wherever the condition on B is satisfied.
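
The same two steps can also be written without `apply`, using group-wise `ffill` and `shift`. The following is a sketch under assumptions, not the answerer's code: it uses the sample data from the question and a strict `<` comparison, following the rule as stated there ("smaller than the previous one"); `filled_C`, `prev_B` and `result` are hypothetical names. Like the approach above, it does not carry a 0 forward to later rows of the group, and a group whose first C is NaN would still contain NaN after the forward fill.

```python
import numpy as np
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "A": ["a", "a", "b", "b", "c", "c", "a", "b", "c"],
    "B": [1, 5, 3, 4, 2, 2, 1, 3, 4],
    "C": [1, np.nan, 1, np.nan, 1, np.nan, np.nan, np.nan, np.nan],
})

grouped = df.groupby("A")
filled_C = grouped["C"].ffill()   # last known C within each group
prev_B = grouped["B"].shift(1)    # previous B within each group (NaN for first rows)

# 0 where B dropped relative to the previous row of the group, else the filled C.
result = df.assign(C=np.where(df["B"] < prev_B, 0, filled_C))
print(result)
```

Both intermediate Series keep the original row index, so the result stays in the original row order and no per-group DataFrames need to be built.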

edited May 5, 2021 at 13:10

answered May 5, 2021 at 12:44


3 Comments

habarnam Over a year ago

Thanks for the np.where solution, I had never used it before. I am still getting some weird results on my actual data: some NaNs appear after applying the function, even if I ffill beforehand as you suggested. I will check and see where the problem is.

Nk03 Over a year ago

np.where is super fast if you want to apply an if-else condition.

habarnam Over a year ago

I'm going to mark your answer as correct, but it didn't work for my actual dataset, as I don't think I managed to put everything in context in one question. But it's a start. Thanks.
