
I have a pandas dataframe like below for the columns value_to_sum and indicator. I'd like to sum all values within value_to_sum up to and including the most recent value within that column where indicator == True. If indicator == False, I do not want to sum.

row value_to_sum indicator desired_outcome
1 1 True NaN
2 3 True 1
3 1 False NaN
4 2 False NaN
5 4 False NaN
6 6 True 10
7 2 True 6
8 3 False NaN

How can I achieve the values under desired_outcome?

  • You wrote: "up to and including the most recent value" but your example in row 2 shows otherwise. Or did I misunderstand something here? Otherwise, why is row 6 a 9? Commented May 5, 2021 at 16:58
  • I don't want to include the value within value_to_sum on that same row within the sum under desired_outcome. But the sum should be inclusive of the last row where indicator == True. So "most recent" means the most recent row previous to the row we're at. Commented May 5, 2021 at 17:00
  • Ok, but then why is row 6 equal 9? Commented May 5, 2021 at 17:01
  • Good catch! I summed that one incorrectly! The value for row 6 should be the sum of rows 2-5 (3+1+2+4 = 10). Commented May 5, 2021 at 17:05
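
The rule settled on in the comments can be spelled out as a plain loop. This is only a reference sketch for checking the expected values, not part of the original question; the function name naive_desired_outcome and the variable last_true are made up for illustration.

```python
import math

value_to_sum = [1, 3, 1, 2, 4, 6, 2, 3]
indicator = [True, True, False, False, False, True, True, False]


def naive_desired_outcome(values, flags):
    """For each True row, sum the values from the previous True row
    (inclusive) up to, but not including, the current row.
    Rows with a False flag, and the first True row, get NaN."""
    out = []
    last_true = None  # index of the most recent earlier True row, if any
    for i, flag in enumerate(flags):
        if flag and last_true is not None:
            out.append(float(sum(values[last_true:i])))
        else:
            out.append(math.nan)
        if flag:
            last_true = i
    return out


result = naive_desired_outcome(value_to_sum, indicator)
print(result)
```

For row 6 the slice covers rows 2-5, which gives the 3+1+2+4 = 10 mentioned in the last comment.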

1 Answer


You can build a group key from the .cumsum() of the indicator column, so that each True row starts a new group and the False rows after it belong to the same group. Then use .groupby() together with .transform() to get the sum of value_to_sum within each group.

Then, for indicator == True, since the desired outcome covers the rows up to but not including the current one, we take the group sum from the previous row by using .shift(). For indicator == False, we set desired_outcome to NaN. Both of these steps are done in a single call to np.where().

import numpy as np  # needed for np.where
df['desired_outcome'] = df.assign(group=df['indicator'].cumsum()).groupby('group')['value_to_sum'].transform('sum')
df['desired_outcome'] = np.where(df['indicator'], df['desired_outcome'].shift(), np.nan)

Result:

print(df)



   row  value_to_sum  indicator  desired_outcome
0    1             1       True              NaN
1    2             3       True              1.0
2    3             1      False              NaN
3    4             2      False              NaN
4    5             4      False              NaN
5    6             6       True             10.0
6    7             2       True              6.0
7    8             3      False              NaN
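
To see why this works, here is a self-contained sketch that rebuilds the example data from the question and runs the same steps one at a time. The intermediate names group, group_total and previous_total are introduced here for illustration only and do not appear in the answer above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "row": range(1, 9),
    "value_to_sum": [1, 3, 1, 2, 4, 6, 2, 3],
    "indicator": [True, True, False, False, False, True, True, False],
})

# Step 1: each True row starts a new group; the False rows after it join it.
group = df["indicator"].cumsum()

# Step 2: broadcast each group's total back onto every row of that group.
group_total = df["value_to_sum"].groupby(group).transform("sum")

# Step 3: shift down one row so each row sees the total of the row above it.
previous_total = group_total.shift()

# Step 4: keep the shifted total on True rows only; everything else is NaN.
df["desired_outcome"] = np.where(df["indicator"], previous_total, np.nan)

# Show the intermediate columns next to the final result.
print(df.assign(group=group, group_total=group_total))
```

The shift lines up because a True row is always the first row of its own group, so the row directly above it belongs to the previous group and carries that group's total.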