Can someone please explain what is actually going on in aggfunc here:
df.pivot_table(values='Loan_Status', index=['Credit_History'],
               aggfunc=lambda x: x.map({'Y': 1, 'N': 0}).mean())
Thank you
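The example below should illustrate what's happening. The Loan_Status values are aggregated by Credit_History: within each group, 'Y' is mapped to 1 and 'N' to 0, and the mean of those numbers is taken. In other words, "add up the number of Y's and divide by the total number of observations in the group".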
import pandas as pd

df = pd.DataFrame([['Y', 'A'], ['N', 'B'], ['Y', 'C'], ['N', 'A'], ['Y', 'C']],
                  columns=['Loan_Status', 'Credit_History'])

df.pivot_table(values='Loan_Status', index=['Credit_History'],
               aggfunc=lambda x: x.map({'Y': 1, 'N': 0}).mean())

#                 Loan_Status
# Credit_History
# A                       0.5
# B                       0.0
# C                       1.0
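
If it helps, the same numbers can be reproduced without pivot_table by doing the two steps by hand. This is only a sketch on the same toy data; the names as_number and share_of_y are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame([['Y', 'A'], ['N', 'B'], ['Y', 'C'], ['N', 'A'], ['Y', 'C']],
                  columns=['Loan_Status', 'Credit_History'])

# Step 1: turn Y/N into 1/0 for the whole column.
as_number = df['Loan_Status'].map({'Y': 1, 'N': 0})

# Step 2: average those numbers within each Credit_History group.
share_of_y = as_number.groupby(df['Credit_History']).mean()

print(share_of_y)
```

The result is a Series rather than a one-column DataFrame, but the per-group values should match the pivot table above.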
map is a method being accessed on the argument to the lambda function. That argument (x) is a pandas Series, so the method is pandas.Series.map.
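
To look at map on its own, here is a small sketch of what happens to a single group's values. The Series below and the 'Maybe' entry are invented for illustration; they are not from the question's data.

```python
import pandas as pd

# One group's worth of Loan_Status values, roughly what the lambda receives as x.
# 'Maybe' is a hypothetical value that is not a key in the mapping dict.
group = pd.Series(['Y', 'N', 'Y', 'Maybe'])

# Series.map with a dict looks up each value; values without a key become NaN.
mapped = group.map({'Y': 1, 'N': 0})

# mean() skips NaN by default, so only the three mapped values are averaged.
result = mapped.mean()

print(mapped)
print(result)
```

One thing to keep in mind with this approach: any value other than 'Y' or 'N' is silently dropped from the average rather than counted as 0.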