Pandas Dataframe groupby()

Question

I have a dataset that looks similar to this:

Name	Status	Activity
Jane	student	yes
John	businessman	yes
Elle	student	no
Chris	policeman	yes
John	businessman	no
Clay	businessman	yes

I want to group the dataset by Status and count the Names whose Activity is 'yes'. A name should be counted if it has at least one 'yes'.

Basically, this is the output that I want:

student 1 Jane

businessman 2 John, Clay

policeman 1 Chris

I've tried the following:

cb = (DataFrame.groupby(['Name', 'Status']).sum(DataFrame['Activity'].eq('yes')))

cb = (DataFrame.groupby(['Name', 'Status']).any(DataFrame['Activity'].eq('yes')))

cb = (DataFrame.groupby(['Name', 'Status']).nunique(DataFrame['Activity'].eq('yes')))

but all of them give this error:

The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Please help me to fix this code. Thank you in advance!

Panda Kim · Accepted Answer · 2022-12-26 13:51:22Z

2

Example

import pandas as pd

data = {'Name': {0: 'Jane', 1: 'John', 2: 'Elle', 3: 'Chris', 4: 'John', 5: 'Clay'},
        'Status': {0: 'student', 1: 'businessman', 2: 'student', 3: 'policeman', 4: 'businessman', 5: 'businessman'},
        'Activity': {0: 'yes', 1: 'yes', 2: 'no', 3: 'yes', 4: 'no', 5: 'yes'}}
df = pd.DataFrame(data)

Code

out = (df[df['Activity'].eq('yes')]
       .groupby('Status', sort=False)['Name'].agg(['count', ', '.join]))

out

            count   join
Status      
student     1       Jane
businessman 2       John, Clay
policeman   1       Chris
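
A likely reason the attempts in the question fail: groupby reductions such as sum(), any() and nunique() take option flags (for example numeric_only, skipna, dropna) rather than a boolean mask, so the Series passed in ends up being truth-tested. The sketch below reproduces the message with a plain bool() call, then applies the mask to the rows before grouping. It rebuilds the example data so it runs on its own; the variable names are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Jane", "John", "Elle", "Chris", "John", "Clay"],
    "Status": ["student", "businessman", "student",
               "policeman", "businessman", "businessman"],
    "Activity": ["yes", "yes", "no", "yes", "no", "yes"],
})

# Boolean mask: True for rows whose Activity is 'yes'.
mask = df["Activity"].eq("yes")

# Truth-testing a multi-element Series raises the "ambiguous" ValueError.
try:
    bool(mask)
    ambiguous = False
except ValueError as exc:
    ambiguous = True
    message = str(exc)

# Filter the rows with the mask first, then group what is left.
counts = df[mask].groupby("Status", sort=False)["Name"].count()
print(counts)
```

The mask belongs in the row selection (df[mask]), not in the argument list of the reduction.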

2 Comments

rainy days. Over a year ago

Thanks for the answer, it works well on the example. But when I apply this code to my real data, it counts the number of 'yes' rows, whereas I want the number of distinct names that have Activity 'yes'. Why is that?

Panda Kim Over a year ago

It's my job to solve the example and yours to adapt it to your dataset. It is not difficult to get your output using this solution. For various reasons, I only solve examples and do not take additional questions. The following function will help you: pandas.pydata.org/docs/reference/api/pandas.unique.html
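
One way to get the distinct-name count the comment asks about is to aggregate with nunique and join only the unique names. This is a sketch, not part of the original answer. The data is hypothetical: it is the example frame with John's second row changed to 'yes', so a plain count would report 3 for businessman. The output column names count and names are also assumptions.

```python
import pandas as pd

# Hypothetical variant of the example: John now has two 'yes' rows.
df = pd.DataFrame({
    "Name": ["Jane", "John", "Elle", "Chris", "John", "Clay"],
    "Status": ["student", "businessman", "student",
               "policeman", "businessman", "businessman"],
    "Activity": ["yes", "yes", "no", "yes", "yes", "yes"],
})

out = (
    df[df["Activity"].eq("yes")]          # keep only the 'yes' rows
    .groupby("Status", sort=False)["Name"]
    .agg(
        count="nunique",                  # distinct names, not rows
        names=lambda s: ", ".join(s.unique()),  # each name listed once
    )
)
print(out)
```

Calling drop_duplicates(['Status', 'Name']) on the filtered rows before grouping would be an alternative that keeps the original count and join aggregation unchanged.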

Abhishek · Accepted Answer · 2022-12-26 15:07:47Z

1

Check below:

dd = (df.query("Activity != 'no'")
        .groupby('Status')
        .agg({'Name': [','.join, 'count']})
        .reset_index())

dd.columns = ['Status','Names','count']

dd.head()
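
The same result can be sketched with named aggregation, which sets the column names directly instead of overwriting dd.columns afterwards. This variant is not from the original answer; it rebuilds the example data so it runs on its own and uses as_index=False in place of reset_index().

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Jane", "John", "Elle", "Chris", "John", "Clay"],
    "Status": ["student", "businessman", "student",
               "policeman", "businessman", "businessman"],
    "Activity": ["yes", "yes", "no", "yes", "no", "yes"],
})

dd = (
    df.query("Activity != 'no'")               # drop the 'no' rows
      .groupby("Status", as_index=False)       # keep Status as a column
      .agg(
          Names=("Name", ",".join),            # comma-separated names
          count=("Name", "count"),             # number of 'yes' rows
      )
)
print(dd)
```

Because groupby sorts the keys by default, the rows come out in alphabetical order of Status (businessman, policeman, student) rather than in order of first appearance.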
