Pandas dataframe drop duplicates based on another column value

Question

I have a dataframe with duplicates:

   timestamp  id  ch  is_eval  c
          12   1   1    False  2
          13   1   0    False  1
          12   1   1     True  4
          13   1   0    False  3

Whenever rows are duplicated, I want to call drop_duplicates with the key (timestamp, id, ch) but keep the row where is_eval is True. That is, if a group of duplicates contains a row with is_eval == True, keep that row; otherwise, it doesn't matter which row is kept. So the output here should be:

   timestamp  id  ch  is_eval  c
          12   1   1     True  4
          13   1   0    False  1

How can I do it?

jezrael · Accepted Answer · 2022-06-01 07:28:43Z


Use:

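# Sort so rows with is_eval == True come first (mergesort is stable, so ties keep their original order),
# then keep the first row of each (timestamp, id, ch) group.
df = df.sort_values('is_eval', kind='mergesort', ascending=False).drop_duplicates(['timestamp', 'id', 'ch'])
print(df)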
   timestamp  id  ch  is_eval  c
2         12   1   1     True  4
1         13   1   0    False  1
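
For anyone who wants to try this end to end, here is a minimal self-contained sketch. The sample frame is rebuilt from the table in the question; the variable name `result` is only illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data reconstructed from the question's table.
df = pd.DataFrame({
    "timestamp": [12, 13, 12, 13],
    "id": [1, 1, 1, 1],
    "ch": [1, 0, 1, 0],
    "is_eval": [False, False, True, False],
    "c": [2, 1, 4, 3],
})

# Stable descending sort puts is_eval == True rows first while leaving
# the relative order of the remaining rows unchanged; drop_duplicates
# then keeps the first row of each (timestamp, id, ch) group.
result = (
    df.sort_values("is_eval", kind="mergesort", ascending=False)
      .drop_duplicates(["timestamp", "id", "ch"])
)
print(result)
```

Because drop_duplicates defaults to keep='first', the row that survives in each group is whichever one the sort placed first, which is the True row whenever one exists.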
