pandas dataframe rolling window with groupby

Question

I can add a new column c that is the sum of the previous two values in b, as shown below...

df['c'] = df.b.rolling(window=2).sum().shift()

df
    a   b     c
0   1   3   NaN
1   1   0   NaN
2   0   6   3.0
3   1   0   6.0
4   0   0   6.0
5   1   7   0.0
6   0   0   7.0
7   0   7   7.0
8   1   4   7.0
9   1   2   11.0

...however, what if I want to group by a first? E.g. I can do this:

df['c'] = df.groupby(['a'])['b'].shift(1) + df.groupby(['a'])['b'].shift(2)

Is there a more elegant way for summing a large number of shifts (1, 2, ...n) on a group?

piRSquared · Accepted Answer · 2016-11-20 06:38:02Z

12

# Apply the rolling sum and the shift together, inside each group
f = lambda x: x.rolling(2).sum().shift()
df['c'] = df.groupby('a').b.apply(f)

df
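Depending on the pandas version, groupby(...).apply may return a result indexed by the group key as well as the original index, which does not assign back to the frame cleanly. A minimal sketch of the same idea using transform, which keeps the original index, is below. The window size n and the cross-check against the question's sum-of-shifts approach are my own additions for illustration, not part of the original answer.

```python
import pandas as pd

# Data taken from the question
df = pd.DataFrame({
    "a": [1, 1, 0, 1, 0, 1, 0, 0, 1, 1],
    "b": [3, 0, 6, 0, 0, 7, 0, 7, 4, 2],
})

n = 2  # number of previous rows to sum within each group (assumed parameter)

# Rolling sum and shift are both applied per group, inside the lambda
c = df.groupby("a")["b"].transform(lambda s: s.rolling(n).sum().shift())
df["c"] = c

# Cross-check: the question's approach, generalized to shifts 1..n
g = df.groupby("a")["b"]
by_shifts = sum(g.shift(k) for k in range(1, n + 1))

# c by row: NaN, NaN, NaN, 3, NaN, 0, 6, 0, 7, 11
print(df)
```

Changing n is all that is needed to sum more previous rows, so no extra shift terms have to be written out by hand.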


2 Comments

Brian Bien Over a year ago

Caution: combining rolling() and shift() inside a lambda function (exactly as piRSquared presented it) is necessary, because it causes both to be applied within each group, which is what you want. The following gives incorrect results: df['c'] = df.groupby('a').b.rolling(2).sum().shift(), because the shift() is applied outside the grouped context.
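To make the difference concrete, here is a small sketch on the question's data that compares the two forms. The variable names (per_group, leaky, leaky_aligned) are made up for illustration, and transform is used for the per-group version so the result keeps the original index.

```python
import pandas as pd

# Data taken from the question
df = pd.DataFrame({
    "a": [1, 1, 0, 1, 0, 1, 0, 0, 1, 1],
    "b": [3, 0, 6, 0, 0, 7, 0, 7, 4, 2],
})

# Shift applied inside each group: the first rows of every group stay NaN
per_group = df.groupby("a")["b"].transform(lambda s: s.rolling(2).sum().shift())

# Shift applied after the grouped rolling sum: the result is one long
# Series ordered by group, so the shift crosses the group boundary
leaky = df.groupby("a")["b"].rolling(2).sum().shift()

# Drop the group level to line the values up with the original rows
leaky_aligned = leaky.droplevel(0).sort_index()

# Row 0 is the first row of group a == 1
print(per_group.loc[0])      # nan
print(leaky_aligned.loc[0])  # 7.0, the last rolling sum of group a == 0
```

The chained form also returns a Series indexed by (a, original index) rather than by the original index alone, so it would need realigning before assignment even if the shift were correct.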

Brian Bien Over a year ago

Sorry, I hope I didn't add confusion: I meant to say that your approach is correct, and that an alternative approach, which may look like a mere syntactic preference, leads to unintended behavior.
