I have the following DataFrame, df:
    days  days_1  days_2  period  percent_1  percent_2  amount
       3       5       4       1        0.2        0.1     100
       2       1       3       4        0.3        0.1     500
       9       8      10       6        0.4        0.2     600
      10       7       8      11        0.5        0.3     700
      10       5       6       7        0.7        0.4     800
The following logic is applied to each row of df:
    for each row in df:
        if days < days_1:
            amount_missed = 0
            days_missed = 0
        elif days_1 < days < days_2:
            missed_percent = percent_1 - percent_2
            amount_missed = amount * (missed_percent / 100)
            days_missed = days - days_1
        elif days_2 < days < period or days > period:
            missed_percent = percent_2
            amount_missed = amount * (missed_percent / 100)
            days_missed = days - days_2
        else:
            amount_missed = 0
            days_missed = 0
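For reference, the pseudocode above could be written as runnable Python roughly as follows. This is only a sketch: the helper name missed_for_row and the variable row_result are hypothetical, and the DataFrame is rebuilt here from the sample table so the snippet runs on its own.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "days":      [3, 2, 9, 10, 10],
    "days_1":    [5, 1, 8, 7, 5],
    "days_2":    [4, 3, 10, 8, 6],
    "period":    [1, 4, 6, 11, 7],
    "percent_1": [0.2, 0.3, 0.4, 0.5, 0.7],
    "percent_2": [0.1, 0.1, 0.2, 0.3, 0.4],
    "amount":    [100, 500, 600, 700, 800],
})

def missed_for_row(r):
    """Hypothetical helper: returns (amount_missed, days_missed) for one row."""
    if r.days < r.days_1:
        return 0.0, 0
    if r.days_1 < r.days < r.days_2:
        missed_percent = r.percent_1 - r.percent_2
        return r.amount * (missed_percent / 100), r.days - r.days_1
    if r.days_2 < r.days < r.period or r.days > r.period:
        missed_percent = r.percent_2
        return r.amount * (missed_percent / 100), r.days - r.days_2
    # Boundary cases such as days == days_1 with days <= period land here.
    return 0.0, 0

# itertuples keeps integer columns as integers, unlike apply(axis=1),
# which would upcast each row to float.
row_result = pd.DataFrame(
    [missed_for_row(r) for r in df.itertuples(index=False)],
    columns=["amount_missed", "days_missed"],
    index=df.index,
)
print(row_result)
```

Note that the branches are checked in order, so a row only reaches the third test when the first two have already failed. Any vectorized version has to preserve that priority.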
I am trying to translate this logic using boolean masks and np.where, as follows:
    import numpy as np

    cond1 = df['days_2'] < df['days']
    cond2 = df['days'] < df['period']
    cond3 = df['days'] > df['period']
    cond4 = df['days'] >= df['days_1']
    cond5 = df['days'] < df['days_2']
    cond6 = df['days'] > df['days_1']
    mask = ((cond1 & cond2) | cond3) & cond4
    mask2 = cond5 & cond6
    df['amount_missed'] = np.where(mask, df['amount'] * df['percent_2'] / 100, 0.0)
    df['amount_missed'] = np.where(mask2, df['amount'] * (df['percent_1'] - df['percent_2']) / 100, 0.0)
    df['days_missed'] = np.where(mask, df['days'] - df['days_2'], 0)
    df['days_missed'] = np.where(mask2, df['days'] - df['days_1'], 0)
However, the result of this code does not match the row-iteration result, which should be:

    {
        'amount_missed': {0: 0.0, 1: 1.0, 2: 1.2, 3: 2.1, 4: 3.2},
        'days_missed': {0: 0, 1: 1, 2: 1, 3: 2, 4: 4}
    }

The boolean-mask version instead produces:

    {
        'amount_missed': {0: 0.0, 1: 0.9999999999999999, 2: 1.2, 3: 0.0, 4: 0.0},
        'days_missed': {0: 0, 1: 1, 2: 1, 3: 0, 4: 0}
    }
How can I fix this? And are there other ways to replace the row iteration over df here?
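One likely cause of the mismatch: each column is assigned twice, and the second np.where call fills every row outside mask2 with 0, which discards the values the first call wrote for rows 3 and 4. A possible fix, sketched below under the assumption that the if/elif order of the row logic must be preserved, is np.select, which takes a list of conditions and picks the first one that matches for each row. The names in_first, in_second, and in_third are hypothetical.

```python
import numpy as np
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "days":      [3, 2, 9, 10, 10],
    "days_1":    [5, 1, 8, 7, 5],
    "days_2":    [4, 3, 10, 8, 6],
    "period":    [1, 4, 6, 11, 7],
    "percent_1": [0.2, 0.3, 0.4, 0.5, 0.7],
    "percent_2": [0.1, 0.1, 0.2, 0.3, 0.4],
    "amount":    [100, 500, 600, 700, 800],
})

# One mask per branch, in the same order as the if/elif chain.
in_first = df["days"] < df["days_1"]
in_second = (df["days_1"] < df["days"]) & (df["days"] < df["days_2"])
in_third = ((df["days_2"] < df["days"]) & (df["days"] < df["period"])) | (
    df["days"] > df["period"]
)
conditions = [in_first, in_second, in_third]

# np.select uses the first matching condition per row; rows matching
# none of them get the default, which plays the role of the else branch.
df["amount_missed"] = np.select(
    conditions,
    [
        0.0,
        df["amount"] * ((df["percent_1"] - df["percent_2"]) / 100),
        df["amount"] * (df["percent_2"] / 100),
    ],
    default=0.0,
)
df["days_missed"] = np.select(
    conditions,
    [0, df["days"] - df["days_1"], df["days"] - df["days_2"]],
    default=0,
)
print(df[["amount_missed", "days_missed"]])
```

A nested np.where(mask2, ..., np.where(mask, ..., 0)) should give the same result for this data. Small differences such as 0.9999999999999999 versus 1.0 are floating-point rounding that depends on the order of multiplication and division, so comparing with np.isclose is safer than exact equality.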