Add column to pandas dataframe based on previous values

Question

I have a dataframe with an observation number, an id, and a value:

Obs#   Id    Value
--------------------
1        1   5.643
2        1   7.345
3        2   0.567
4        2   1.456

I want to calculate a new column that holds, for each row, the mean of the previous values with the same id.

I am trying to use something like this but it only acquires the previous value:

df.groupby('Id')['Value'].apply(lambda x: x.shift(1) ...

My question is: how do I get the range of previous values, filtered by Id, so that I can calculate the mean?

So the new column based on this example should be

user3483203 · Accepted Answer · 2018-07-18 19:04:33Z

8

It seems that you want expanding(), followed by mean():

df.groupby('Id').Value.expanding().mean()

Id
1.0  1    5.6430
     2    6.4940
2.0  3    0.5670
     4    1.0115
Name: Value, dtype: float64
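Note that expanding().mean() includes the current row in each mean. If the column should cover only the strictly earlier rows of the same Id, one option is to shift the running mean by one row within each group. The following is a minimal sketch under that assumption; the column names RunningMean and PrevMean are hypothetical, and the first row of each Id is left as NaN because it has no previous values.

```python
import pandas as pd

df = pd.DataFrame({
    "Obs": [1, 2, 3, 4],
    "Id": [1, 1, 2, 2],
    "Value": [5.643, 7.345, 0.567, 1.456],
})

# Running mean per Id, including the current row.
# transform keeps df's original index, so it can be assigned directly.
df["RunningMean"] = df.groupby("Id")["Value"].transform(
    lambda s: s.expanding().mean()
)

# Mean of strictly earlier rows: the running mean of the previous row
# in the same group. The first row of each group becomes NaN.
df["PrevMean"] = df.groupby("Id")["RunningMean"].shift(1)

print(df)
```

Whether the current row belongs in the mean depends on what "previous values" is meant to cover, so pick whichever column matches the intended definition.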

edited Jul 18, 2018 at 19:04

answered Jul 18, 2018 at 19:02


2 Comments

rafaelc Over a year ago

Never used expanding, nice +1

btathalon Over a year ago

To merge this groupby series back into the original df I used series = series.sortlevel(level=1) and then df['Mean'] = series.values
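As far as I know, Series.sortlevel is no longer available in recent pandas releases, with sort_index(level=1) being the closest equivalent. An alternative that avoids relying on row order is to drop the Id level from the result and let pandas align on the original index. A minimal sketch, assuming the column name Mean from the comment above:

```python
import pandas as pd

df = pd.DataFrame({
    "Obs": [1, 2, 3, 4],
    "Id": [1, 1, 2, 2],
    "Value": [5.643, 7.345, 0.567, 1.456],
})

# The grouped expanding mean is indexed by (Id, original row label).
series = df.groupby("Id")["Value"].expanding().mean()

# Drop the Id level so the labels match df's index again.
# Column assignment then aligns by label, not by position.
df["Mean"] = series.reset_index(level=0, drop=True)

print(df)
```

Because the assignment aligns on index labels, this should also hold up when rows of different ids are interleaved, provided the index labels of df are unique.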

Bal Krishna Jha · Accepted Answer · 2018-07-18 19:34:49Z

1

You can also do it like this:

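import numpy as np
import pandas as pd

df = pd.DataFrame({'Obs': [1, 2, 3, 4], 'Id': [1, 1, 2, 2], 'Value': [5.643, 7.345, 0.567, 1.456]})

# Running mean per Id: cumulative sum divided by the running row count (1, 2, ...)
df.groupby('Id')['Value'].apply(lambda x: x.cumsum() / np.arange(1, len(x) + 1))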

It gives the same per-Id running means as the expanding mean: 5.643, 6.494, 0.567 and 1.0115.
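The same cumulative-sum idea can be written without a lambda by combining the groupby methods cumsum and cumcount. This is a sketch rather than the answerer's own code; it assumes Value contains no NaN entries, since cumcount counts every row while expanding().mean() ignores missing values.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Obs": [1, 2, 3, 4],
    "Id": [1, 1, 2, 2],
    "Value": [5.643, 7.345, 0.567, 1.456],
})

grouped = df.groupby("Id")["Value"]

# cumcount numbers the rows of each group 0, 1, 2, ...,
# so adding 1 gives the running row count used as the divisor.
cum_mean = grouped.cumsum() / (grouped.cumcount() + 1)

# Reference result from the expanding mean, re-indexed like df.
expanding_mean = grouped.expanding().mean().reset_index(level=0, drop=True)

# Both approaches should agree on this data.
print(np.allclose(cum_mean, expanding_mean))
```

Both cumsum and cumcount return a Series on the original index, so the result can be assigned straight to a new column of df.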

answered Jul 18, 2018 at 19:34

