0

The end goal is to sum the values in the "Price" column based on multiple conditions in other columns. I am unsure how to do it though.

import pandas as pd

#creating random dataset
rnd = pd.DataFrame({'Status':['Accepted', 'Denied', 'Accepted', 'Accepted', 'Denied'],
                    'Letter':['A 02-02-19', 'AB 10-31-21', 'A 03-07-18', 'CD 11-13-21', 'A 04-05-21'], 
                    'Fruit':['Apple', 'Orange', 'Blueberry', 'Orange', 'Apple'],
                    'Price':[10,20,14,15,29]})

#output of dataframe
rnd
  1. Needs to have "Accepted" as a value in the "Status" Column. I know this can be done by doing

    ''' net = rnd.loc(rnd["Status"] == "Accepted", "Price"].sum() '''

However, I also needed it to be summed based on what occurs in the "Letter" Column. I don't care about the random dates in the value, only the characters in the beginning of the string. AB would be grouped in a different Group than A, which would also be grouped differently than CD. I am trying to figure out how to sum "Price" if I only wanted those who had "A" in the Letter column and "Accepted" in the Status column.

1
  • Kindly add your expected output dataframe Commented Dec 7, 2021 at 21:08

2 Answers 2

2

Try:

rnd.query('Status == "Accepted"')\
   .groupby(rnd['Letter'].str.split(' ').str[0])['Price'].sum()

Output:

Letter
A     24
CD    15
Name: Price, dtype: int64
Sign up to request clarification or add additional context in comments.

Comments

0

In general:

foo = self.db[(self.db['COL_NAME'] == val1) & (self.db['OTHER_COL'] != 0)]['COL_TO_SUM'].sum()

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.