Suppose I have the following DataFrame:
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        'state': ['CA', 'WA', 'CO', 'AZ'] * 3,
        'office_id': list(range(1, 7)) * 2,
        # pd.np was removed in pandas 2.0, so NumPy is imported directly
        'sales': [np.random.randint(100000, 999999) for _ in range(12)]
    }
)
Here is one possible result (the sales values are random):
    office_id   sales state
0           1  903325    CA
1           2  364594    WA
2           3  737728    CO
3           4  239378    AZ
4           5  833003    CA
5           6  501536    WA
6           1  920821    CO
7           2  879602    AZ
8           3  661818    CA
9           4  548888    WA
10          5  842459    CO
11          6  906791    AZ
Now I do a groupby operation on office_id and state:
df.groupby(["office_id", "state"]).aggregate({"sales": "sum"})
This leads to:
                  sales
office_id state
1         CA     903325
          CO     920821
2         AZ     879602
          WA     364594
3         CA     661818
          CO     737728
4         AZ     239378
          WA     548888
5         CA     833003
          CO     842459
6         AZ     906791
          WA     501536
Is it possible to add a row for each office_id, under a new index label (for example, total), that holds the sum of the sales column over all states of that office?
I can compute the totals by grouping by "office_id" alone and summing, but that gives me a separate DataFrame, and I have not managed to merge it back into the grouped result.
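
For reference, here is a minimal sketch of that attempt. The sales values are hard-coded from the sample output above so the numbers are reproducible, and the names `grouped` and `totals` are only illustrative:

```python
import pandas as pd

# Same data as above, with the random sales values fixed to the sample shown
df = pd.DataFrame(
    {
        'state': ['CA', 'WA', 'CO', 'AZ'] * 3,
        'office_id': list(range(1, 7)) * 2,
        'sales': [903325, 364594, 737728, 239378, 833003, 501536,
                  920821, 879602, 661818, 548888, 842459, 906791],
    }
)

# Per (office_id, state) sums: indexed by a two-level MultiIndex
grouped = df.groupby(["office_id", "state"]).aggregate({"sales": "sum"})

# Per office_id sums: a separate DataFrame indexed by office_id only
totals = df.groupby("office_id").aggregate({"sales": "sum"})

print(grouped.index.names)  # index levels of the grouped result
print(totals.index.names)   # index levels of the totals
print(totals)
```

The two results do not share the same index structure: `grouped` has two index levels (office_id, state), while `totals` has only office_id. That mismatch appears to be what makes combining them awkward.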