Delete rows with duplicate values in a dataframe

Question

I have a dataframe of codes recorded at different times (one column per time), like this:

   time1 time2 time3  time4
0  A09.9 B25   A02.2  NaN
1  B21   J2    Z23.1  J2
2  C21.2 C03   NaN    NaN

I need to remove every row in which the same value appears in more than one column. In this case that is the second row (index 1), where J2 appears in both time2 and time4. The expected result:

   time1 time2 time3  time4
0  A09.9 B25   A02.2  NaN
1  C21.2 C03   NaN    NaN

I haven't found an efficient way to do this, only iterating row by row.

BENY · Accepted Answer · 2019-09-13 21:57:30Z


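We can use nunique along the rows and compare it with the count of non-null values. A row has no duplicates exactly when the two numbers are equal.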

df[df.nunique(1)==df.notnull().sum(1)]
Out[154]: 
   time1 time2  time3 time4
0  A09.9   B25  A02.2   NaN
2  C21.2   C03    NaN   NaN
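
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The dataframe is rebuilt from the question's sample, and the intermediate names `distinct`, `present` and `result` are only illustrative. It relies on `nunique` ignoring NaN by default, so the NaN cells do not count as values on either side of the comparison.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "time1": ["A09.9", "B21", "C21.2"],
    "time2": ["B25", "J2", "C03"],
    "time3": ["A02.2", "Z23.1", np.nan],
    "time4": [np.nan, "J2", np.nan],
})

# Number of distinct non-null codes in each row (NaN is ignored by default)
distinct = df.nunique(axis=1)

# Number of non-null codes in each row
present = df.notnull().sum(axis=1)

# Keep rows where every non-null code is distinct
result = df[distinct == present]
print(result)
```

The filtered frame keeps the original index labels (0 and 2). If you want them renumbered as in the question's expected output, `result.reset_index(drop=True)` should do it.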


1 Comment

jottbe Over a year ago

That's nice! I was thinking about a version with melt combined with groupby, but I guess that would not have come close to the performance of this solution.
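
For comparison, the melt-plus-groupby idea mentioned in this comment could look roughly like the sketch below. This is one possible reading of that idea, not code from the commenter; the names `long`, `code` and `has_dup` are made up for illustration, and it assumes the dataframe has a default integer index so that `reset_index` produces a column called `index`.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "time1": ["A09.9", "B21", "C21.2"],
    "time2": ["B25", "J2", "C03"],
    "time3": ["A02.2", "Z23.1", np.nan],
    "time4": [np.nan, "J2", np.nan],
})

# Reshape to long format: one line per (original row, code), dropping NaN cells
long = (
    df.reset_index()
      .melt(id_vars="index", value_name="code")
      .dropna(subset=["code"])
)

# Flag original rows in which some code occurs more than once
has_dup = long.duplicated(subset=["index", "code"]).groupby(long["index"]).any()

# Rows that were entirely NaN have no entry in has_dup, so treat them as duplicate-free
result = df[~has_dup.reindex(df.index, fill_value=False)]
print(result)
```

It selects the same rows as the accepted answer but needs a reshape and a groupby to get there.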
