
I have a data frame as below:

user | profit
-------------
Anna |    1.0
Bell |    2.0
Anna |    2.0
Chad |    5.0
Bell |    4.0
Anna |    3.0

I need to compute a running mean per user: each time the same user appears, I want the mean of that user's profit so far.

For instance, Anna's first profit mean is 1.0 and her second profit mean becomes 1.5, and so on.

The desired result looks like:

user | profit | mean
--------------------
Anna |    1.0 |  1.0
Bell |    2.0 |  2.0
Anna |    2.0 |  1.5
Chad |    5.0 |  5.0
Bell |    4.0 |  3.0
Anna |    3.0 |  2.0

Any suggestions on how to do this in Python/pandas?

import pandas as pd

record = pd.DataFrame({
    "user": ("Anna", "Bell", "Anna", "Chad", "Bell", "Anna"), 
    "profit": (1.0, 2.0, 2.0, 5.0, 4.0, 3.0)
})

Thanks!

2 Answers

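Use GroupBy.transform with rolling and mean. Setting the window to the group length with min_periods=1 makes each row's mean cover all rows seen so far for the same user (df here is the DataFrame called record in the question):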

df['mean'] = (df.groupby('user')['profit']
                .transform(lambda x: x.rolling(len(x), min_periods=1).mean()))
print(df)
   user  profit  mean
0  Anna     1.0   1.0
1  Bell     2.0   2.0
2  Anna     2.0   1.5
3  Chad     5.0   5.0
4  Bell     4.0   3.0
5  Anna     3.0   2.0
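
A rolling window as long as the group with min_periods=1 behaves like an expanding window, so the same idea can probably be written with expanding() instead. This is only a sketch of that variant, not part of the original answer; the DataFrame is rebuilt from the question's sample data and named df.

```python
import pandas as pd

# Sample data copied from the question (named df here, record in the question)
df = pd.DataFrame({
    "user": ("Anna", "Bell", "Anna", "Chad", "Bell", "Anna"),
    "profit": (1.0, 2.0, 2.0, 5.0, 4.0, 3.0),
})

# expanding() grows the window from the first row of each group up to the
# current row, so mean() yields the per-user running mean
df["mean"] = df.groupby("user")["profit"].transform(lambda x: x.expanding().mean())

print(df)
```

The transform call keeps the original row order and index, so the result can be assigned straight back as a column.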

4 Comments

I'm thinking something like dividing cumsum and cumcount?
Great one liner solution!
@Rock I'm allergic to one-liners but the idea is smart! And I think this is readable! (one-liners can sometimes become very hard to read)
@jezrael Hey, I figured we could use cumsum() and divide by cumcount() + 1? What do you think? I think it is more readable at least. Speed-wise they look the same.

I think we can use cumsum() and divide by the count so far (cumcount() is zero-based, hence the + 1).

g = df.groupby('user')['profit']
df['mean'] = g.cumsum() / (g.cumcount() + 1)

Full example

import pandas as pd

df = pd.DataFrame({
    "user": ("Anna", "Bell", "Anna", "Chad", "Bell", "Anna"), 
    "profit": (1.0, 2.0, 2.0, 5.0, 4.0, 3.0)
})

g = df.groupby('user')['profit']
df['mean'] = g.cumsum() / (g.cumcount() + 1)

print(df)

Returns:

   user  profit  mean
0  Anna     1.0   1.0
1  Bell     2.0   2.0
2  Anna     2.0   1.5
3  Chad     5.0   5.0
4  Bell     4.0   3.0
5  Anna     3.0   2.0
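
To check that this gives the same numbers as the rolling approach from the other answer, the two results can be compared directly. The following is a rough sketch on the question's sample data only; the names by_cumsum and by_rolling are made up for illustration, and it says nothing about how the two behave with missing values.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "user": ("Anna", "Bell", "Anna", "Chad", "Bell", "Anna"),
    "profit": (1.0, 2.0, 2.0, 5.0, 4.0, 3.0),
})

g = df.groupby("user")["profit"]

# Running sum divided by the running (1-based) count within each user
by_cumsum = g.cumsum() / (g.cumcount() + 1)

# Rolling window as long as the group, from the other answer
by_rolling = g.transform(lambda x: x.rolling(len(x), min_periods=1).mean())

# Raises AssertionError if the two results differ beyond floating-point tolerance
pd.testing.assert_series_equal(by_cumsum, by_rolling, check_names=False)
print("both approaches agree on the sample data")
```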
