Collapse rows in Pandas dataframe with different logic per column [duplicate]

Question

I want to collapse DataFrame rows that share the same value in a given column, while combining each of the remaining columns with its own rule. Example:

City           ColumnA   ColumnB
Seattle        20        30
Seattle        30        20
Portland       25        25
Portland       10        40

For instance, I want to collapse by City, keeping the lowest value for ColumnA and the mean value for ColumnB. The result should look like:

City           ColumnA   ColumnB
Seattle        20        25
Portland       10        32.5

This is just an example; in my real problem I want to apply more complex logic than min() or mean().

What is the right, cleanest and simplest way of doing this? Thank you.

This is all covered in the relevant section of the docs.

– DSM, Commented Apr 27, 2018 at 18:46

sacuL · Accepted Answer · 2018-04-27 18:44:40Z

Use groupby and .agg:

df.groupby('City', as_index=False).agg({'ColumnA':'min', 'ColumnB':'mean'})

       City  ColumnA  ColumnB
0  Portland       10     32.5
1   Seattle       20     25.0
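For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the frame from the question and applies the same call. The variable names are mine, and passing sort=False is an addition that is not in the original answer: it keeps the groups in order of first appearance, which matches the row order shown in the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "City": ["Seattle", "Seattle", "Portland", "Portland"],
    "ColumnA": [20, 30, 25, 10],
    "ColumnB": [30, 20, 25, 40],
})

# One aggregation rule per column: minimum for ColumnA, mean for ColumnB.
# sort=False (an addition here) keeps Seattle before Portland, as in the question.
result = df.groupby("City", as_index=False, sort=False).agg(
    {"ColumnA": "min", "ColumnB": "mean"}
)

print(result)
```

Without sort=False the groups come back sorted by key, which is why the output above lists Portland first.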


2 Comments

Didac Perez Parera Over a year ago

Thanks! What if the logic I want to apply is not min() or mean(), but something custom?

sacuL Over a year ago

.agg can be a bit tricky, and it depends on what you want to do, but take a look here; it might help.
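As a rough sketch of the custom case: the values in the dict passed to .agg can be plain Python callables instead of strings. Each one receives a group's column as a Series and should return a single value. The two functions below, value_range and midpoint, are hypothetical stand-ins for whatever the real logic is; they are not from the question or the answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "City": ["Seattle", "Seattle", "Portland", "Portland"],
    "ColumnA": [20, 30, 25, 10],
    "ColumnB": [30, 20, 25, 40],
})

def value_range(series):
    # Hypothetical custom rule: spread between the largest and smallest value.
    return series.max() - series.min()

def midpoint(series):
    # Hypothetical custom rule: halfway point between the extremes.
    return (series.min() + series.max()) / 2

# Each callable is applied to its own column, once per City group.
custom = df.groupby("City", as_index=False).agg(
    {"ColumnA": value_range, "ColumnB": midpoint}
)

print(custom)
```

This only covers logic that looks at one column at a time. If the rule for one column depends on values in another, the per-column dict will not be enough and something like groupby(...).apply on the whole group is probably the place to look.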
