Find Duplicate rows from df. Python

Question

df = 

Name    Age City
Jack    34  Sydney
Riti    30  Delhi
Aadi    16  New York
Riti    30  Delhi
Riti    30  Delhi
Riti    30  Mumbai
Aadi    40  London
Sachin  30  Delhi

df[df.duplicated(keep='last')]

The code above returns the duplicated rows. What I need instead is a check: if the df contains at least one duplicate row, it should return "The df contains duplicate rows".

Hi, when sharing a DataFrame, could you please paste it in a format that others can use directly, such as a dict? That saves us from retyping it ;)
– azro, Commented Mar 19, 2020 at 9:06

Sayandip Dutta · Accepted Answer · 2020-03-19 09:07:33Z

1

You can call any on the result of duplicated:

>>> df
     Name  Age     City
0    Jack   34   Sydney
1    Riti   30    Delhi
2    Aadi   16  NewYork
3    Riti   30    Delhi
4    Riti   30    Delhi
5    Riti   30   Mumbai
6    Aadi   40   London
7  Sachin   30    Delhi
>>> df.duplicated().any()
True
>>> 'The df contains duplicates' if df.duplicated().any() else 'no duplicates' 
'The df contains duplicates'
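
For anyone who wants to run this without retyping the table, here is a minimal, self-contained sketch. The dict literal is transcribed from the question's data, and the helper name duplicate_message is hypothetical, chosen only for illustration.

```python
import pandas as pd

# The question's table, written as a dict so it can be pasted and run directly.
df = pd.DataFrame({
    "Name": ["Jack", "Riti", "Aadi", "Riti", "Riti", "Riti", "Aadi", "Sachin"],
    "Age": [34, 30, 16, 30, 30, 30, 40, 30],
    "City": ["Sydney", "Delhi", "New York", "Delhi", "Delhi", "Mumbai", "London", "Delhi"],
})


def duplicate_message(frame):
    """Hypothetical helper: say whether any row repeats an earlier row."""
    if frame.duplicated().any():
        return "The df contains duplicate rows"
    return "The df contains no duplicate rows"


# Boolean mask, one entry per row; rows 3 and 4 repeat row 1.
mask = df.duplicated()
print(mask.tolist())
print(duplicate_message(df))
```

Wrapping the check in a function keeps the message in one place; whether that is worth it over the one-line conditional expression above is a matter of taste.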


Seleme · Accepted Answer · 2020-03-19 09:08:37Z

1

duplicated returns a boolean Series with one value per row. With the default keep='first', a row is marked True if it repeats an earlier row; the first occurrence stays False.

Hence, you can do the below:

df.duplicated().any()

It will return True if there is any duplicate in your DataFrame.
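
As a rough illustration of why the yes/no check does not depend on the keep argument used in the question, the sketch below compares the masks for each setting. The four-row frame is a hypothetical subset of the question's data, picked only to keep the output short.

```python
import pandas as pd

# Hypothetical four-row subset of the question's data.
df = pd.DataFrame({
    "Name": ["Riti", "Aadi", "Riti", "Riti"],
    "Age": [30, 16, 30, 30],
    "City": ["Delhi", "New York", "Delhi", "Mumbai"],
})

first = df.duplicated()              # default keep='first': later repeats are True
last = df.duplicated(keep="last")    # earlier repeats are True, last one is False
every = df.duplicated(keep=False)    # every member of a duplicate group is True

print(first.tolist())   # [False, False, True, False]
print(last.tolist())    # [True, False, False, False]
print(every.tolist())   # [True, False, True, False]

# Whichever rows get marked, any() gives the same answer.
has_duplicates = bool(first.any())
print(has_duplicates)   # True
```

The keep setting only changes which rows of a duplicate group are flagged, so it matters when selecting rows with df[...] but not when all you need is a single True or False.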
