1

this was closed with the wrong answer, please provide the answers

group_ = pd.DataFrame({'Name' : ['X','Y', 'X', 'X', 'Y','Y'],
     'a' : [3,6,4,7,9,3],
     'b' : [6,4,8,7,6,5],
     'h' : [34,43,63,34,54,56]})

  Name  a  b   h
0    X  3  6  34
1    Y  6  4  43
2    X  4  8  63
3    X  7  7  34
4    Y  9  6  54
5    Y  3  5  56

I would like to get the mean and std values of X and Y for all columns ( without listing them in agg, or anywhere has to be dynamic).

for mean I could have done this :

group_.groupby('Name').mean().mean(axis=1)

Name
X    18.444444
Y    20.666667
dtype: float64

np.mean([3,6,34,4,8,63,7,7,34]) = 18.444444

but for std not, doesn't work, because it gives the std of std of 3 columns.

group_.groupby('Name').std().std(axis=1)

Name
X    8.793713
Y    3.055050
dtype: float64

NOT CORRECT! X_std = np.std([3,6,34,4,8,63,7,7,34]) = 19.556818141059377

0

1 Answer 1

0

For std by all columns per groups first use DataFrame.melt and then GroupBy.std with ddof=0 (default pandas ddof=1):

a = group_.melt('Name').groupby('Name')['value'].std(ddof=0)

print (a)
Name
X    19.556818
Y    21.756225
Name: value, dtype: float64

X_std = np.std([3,6,34,4,8,63,7,7,34])
print (X_std)
19.556818141059377

For both aggregate functions:

df = group_.melt('Name').groupby('Name').agg(std=('value', lambda x: x.std(ddof=0)),
                                             mean=('value','mean'))

print (df)
            std       mean
Name                      
X     19.556818  18.444444
Y     21.756225  20.666667
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.