I want to collapse DataFrame rows that share a value in a given column, while combining each of the remaining columns with its own logic. Example:

City           ColumnA   ColumnB
Seattle        20        30
Seattle        30        20
Portland       25        25
Portland       10        40

I want to collapse by City, keeping the lowest value of ColumnA and the mean of ColumnB, for instance. The result should look like:

City           ColumnA   ColumnB
Seattle        20        25
Portland       10        32.5

This is just an example; in my real problem I want to apply more complex logic than min() or mean().

What is the right, cleanest and simplest way of doing this? Thank you.

  • This is all covered in the relevant section of the docs. Commented Apr 27, 2018 at 18:46

1 Answer

Use groupby and .agg:

df.groupby('City', as_index=False).agg({'ColumnA':'min', 'ColumnB':'mean'})

       City  ColumnA  ColumnB
0  Portland       10     32.5
1   Seattle       20     25.0
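For anyone who wants to run this end to end, here is a minimal self-contained sketch that rebuilds the example frame from the question. The `sort=False` argument is my addition, not part of the answer above; it keeps the groups in order of first appearance so the output matches the order shown in the question.

```python
import pandas as pd

# Example data copied from the question
df = pd.DataFrame({
    "City": ["Seattle", "Seattle", "Portland", "Portland"],
    "ColumnA": [20, 30, 25, 10],
    "ColumnB": [30, 20, 25, 40],
})

# One aggregation per column: min for ColumnA, mean for ColumnB.
# sort=False (an assumption about the desired ordering) keeps Seattle first.
result = df.groupby("City", as_index=False, sort=False).agg(
    {"ColumnA": "min", "ColumnB": "mean"}
)
print(result)
```

Without `sort=False`, the groups come back sorted by key, which is why the answer's output lists Portland first.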

2 Comments

Thanks! What if the logic I want to apply is not min() or mean(), but something custom?
.agg can be a bit tricky, and it depends on what you want to do, but take a look here; it might help.
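As a rough sketch of the custom case: `.agg` also accepts callables in the dict, and each callable receives that column's values for one group as a Series and should return a single value. The function `value_range` and the "keep the last value" rule below are hypothetical stand-ins for whatever the real logic is.

```python
import pandas as pd

# Example data copied from the question
df = pd.DataFrame({
    "City": ["Seattle", "Seattle", "Portland", "Portland"],
    "ColumnA": [20, 30, 25, 10],
    "ColumnB": [30, 20, 25, 40],
})

def value_range(s):
    """Hypothetical custom rule: spread between the largest and smallest value."""
    return s.max() - s.min()

# Each column gets its own callable; the lambda keeps the last value in the group.
result = df.groupby("City", as_index=False).agg(
    {"ColumnA": value_range, "ColumnB": lambda s: s.iloc[-1]}
)
print(result)
```

If the rule needs to look at several columns at once, a function passed to `groupby(...).apply` receives the whole sub-DataFrame for each group instead of a single column, which may fit better.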
