Pandas, group by count and add count to original dataframe?

Question

When trying to count rows with similar 'kind' in data frame:

import pandas as pd

items = [('aaa','aaa text 1'), ('aaa','aaa text 2'), ('aaa','aaa text 3'),
         ('bb', 'bb text 1'), ('bb', 'bb text 2'), ('bb', 'bb text 3'), 
         ('bb', 'bb text 4'),
         ('cccc','cccc text 1'), ('cccc','cccc text 2'),
         ('dd', 'dd text 1'),
         ('e', 'e text 1'),
         ('fff', 'fff text 1'),
        ]

df = pd.DataFrame(items, columns=['kind', 'msg'])
df

    kind    msg
0   aaa     aaa text 1
1   aaa     aaa text 2
2   aaa     aaa text 3
3   bb      bb text 1
4   bb      bb text 2
5   bb      bb text 3
6   bb      bb text 4
7   cccc    cccc text 1
8   cccc    cccc text 2
9   dd      dd text 1
10  e       e text 1
11  fff     fff text 1

This code works:

df = df[['kind']].groupby(['kind'])['kind'] \
                         .count() \
                         .reset_index(name='count') \
                         .sort_values(['count'], ascending=False) \
                         .head(5)

df

Resulting in:

    kind      count
    0   aaa   1
    1   bb    1
    2   cccc  1
    3   dd    1
    4   e     1

Yet, how can one get a data frame with all columns as in original one plus 'count' column? So the result should have columns 'kind', 'msg', 'count' in this order?

Also, how to sort this resulting data frame in descending order of count?

MaxU - stand with Ukraine · Accepted Answer · 2017-07-27 09:48:34Z

18

IIUC

In [247]: df['count'] = df.groupby('kind').transform('count')

In [248]: df
Out[248]:
    kind          msg  count
0    aaa   aaa text 1      3
1    aaa   aaa text 2      3
2    aaa   aaa text 3      3
3     bb    bb text 1      4
4     bb    bb text 2      4
5     bb    bb text 3      4
6     bb    bb text 4      4
7   cccc  cccc text 1      2
8   cccc  cccc text 2      2
9     dd    dd text 1      1
10     e     e text 1      1
11   fff   fff text 1      1

sorting:

In [249]: df.sort_values('count', ascending=False)
Out[249]:
    kind          msg  count
3     bb    bb text 1      4
4     bb    bb text 2      4
5     bb    bb text 3      4
6     bb    bb text 4      4
0    aaa   aaa text 1      3
1    aaa   aaa text 2      3
2    aaa   aaa text 3      3
7   cccc  cccc text 1      2
8   cccc  cccc text 2      2
9     dd    dd text 1      1
10     e     e text 1      1
11   fff   fff text 1      1

answered Jul 27, 2017 at 9:48

MaxU - stand with Ukraine

212k37 gold badges402 silver badges436 bronze badges

Sign up to request clarification or add additional context in comments.

3 Comments

leonard Over a year ago

The transform('count') function returned multiple colums, so assigning it did not work. I had to specify which column I wanted to return. Is there something I am missing @MaxU?

MaxU - stand with Ukraine Over a year ago

@leonard, I don’t know what do you want to achieve and how your dataset looks like. I’d recommend you to ask a new question and specify there a small sample dataset and your desired dataset

shantanu pathak Over a year ago

@leonard your query is exactly answered in second answer

Community · Accepted Answer · 2021-06-30 21:05:11Z

17

Here is the simple code to count the frequencies and add a column to the data frame when grouping by the kind column.

df['count'] = df.groupby('kind')['kind'].transform('count')

edited Jun 30, 2021 at 21:05

CommunityBot

11 silver badge

answered Sep 25, 2018 at 5:47

Shubham Singh Chauhan

2092 silver badges7 bronze badges

1 Comment

shantanu pathak Over a year ago

This should be accepted answer .. as per my knowledge

Olsgaard · Accepted Answer · 2022-06-15 11:42:40Z

0

This can also be achieved as part of a chain by

df.assign(
  count=lambda x: x.groupby('kind')['kind'].transform('count')
)

This is useful if you already have a chained expression, or you need to pass the dataframe with the extra column to a function but don't want to overwrite the dataframe.

answered Jun 15, 2022 at 11:42

Olsgaard

1,6601 gold badge12 silver badges22 bronze badges

Collectives™ on Stack Overflow

Pandas, group by count and add count to original dataframe?

3 Answers 3

3 Comments

1 Comment

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

3 Comments

1 Comment

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related