I have a DataFrame, where all columns are in binary form, i.e; taking either 0 or 1 as value as shown below.
import pandas as pd
df = pd.DataFrame({'qualified_exam':[1,1,0,0,1,1],'gender_M':[1,0,1,0,1,1],'employed':[1,0,0,1,1,1],'married':[0,1,0,1,1,0]})
print(df)
qualified_exam gender_M employed married
0 1 1 1 0
1 1 0 0 1
2 0 1 0 0
3 0 0 1 1
4 1 1 1 1
5 1 1 1 0
I want to create a DataFrame where I want to measure sum/mean of column qualified _exam by grouping by all the remaining 3 columns individually - gender_M, employed, married.
Final DataFrame should look something like this -
sum_0 sum_1 mean_0 mean_1
gender_M 1 3 0.50 0.75
employed 1 3 0.50 0.75
married 2 2 0.66 0.66
I tried it by doing groupby() and agg() for each of the 3 columns individually, and then appending the result one by one. This is too cumbersome. I am sure there is a better way.