df = 

Name    Age City
Jack    34  Sydney
Riti    30  Delhi
Aadi    16  New York
Riti    30  Delhi
Riti    30  Delhi
Riti    30  Mumbai
Aadi    40  London
Sachin  30  Delhi
df[df.duplicated(keep='last')]

The above code returns the duplicated rows. What I need instead is a check: if the df contains at least one duplicate row, it should return "The df contains duplicate rows".

    Hi. When sharing a DataFrame, could you please paste it in a format that others can use directly, such as a dict, so that we don't have to retype it? ;) Commented Mar 19, 2020 at 9:06

2 Answers


You can use any:

>>> df
     Name  Age     City
0    Jack   34   Sydney
1    Riti   30    Delhi
2    Aadi   16  NewYork
3    Riti   30    Delhi
4    Riti   30    Delhi
5    Riti   30   Mumbai
6    Aadi   40   London
7  Sachin   30    Delhi
>>> df.duplicated().any()
True
>>> 'The df contains duplicates' if df.duplicated().any() else 'no duplicates' 
'The df contains duplicates'
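
For anyone who wants to run this without retyping the table, here is a minimal self-contained sketch. It rebuilds the question's data from a dict and wraps the check in a small function; the name duplicate_message and the wording of the "no duplicates" message are illustrative assumptions, not something from the question.

```python
import pandas as pd

# The question's sample data, rebuilt as a dict so it can be pasted directly.
data = {
    "Name": ["Jack", "Riti", "Aadi", "Riti", "Riti", "Riti", "Aadi", "Sachin"],
    "Age": [34, 30, 16, 30, 30, 30, 40, 30],
    "City": ["Sydney", "Delhi", "New York", "Delhi", "Delhi", "Mumbai", "London", "Delhi"],
}
df = pd.DataFrame(data)


def duplicate_message(frame: pd.DataFrame) -> str:
    """Hypothetical helper: report whether any full row is repeated."""
    if frame.duplicated().any():
        return "The df contains duplicate rows"
    return "The df contains no duplicate rows"


print(duplicate_message(df))                    # original data, has repeated rows
print(duplicate_message(df.drop_duplicates()))  # same data with repeats dropped
```

The function only answers yes or no; to see which rows are repeated, index the frame with df.duplicated() as in the question.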

duplicated returns a boolean Series with one value per row. With the default keep='first', a row is marked True when it repeats an earlier row, while the first occurrence itself is marked False.

Hence, you can do the below:

df.duplicated().any()

It will return True if there is any duplicate in your DataFrame.
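
As a rough illustration of how the keep argument changes which rows get flagged, the sketch below rebuilds the question's data (the dict layout is my own reconstruction of the table) and compares the three settings. Whichever setting is used, any() gives the same yes/no answer, because a duplicate group always has at least one flagged row.

```python
import pandas as pd

# Sample data reconstructed from the question's table.
df = pd.DataFrame({
    "Name": ["Jack", "Riti", "Aadi", "Riti", "Riti", "Riti", "Aadi", "Sachin"],
    "Age": [34, 30, 16, 30, 30, 30, 40, 30],
    "City": ["Sydney", "Delhi", "New York", "Delhi", "Delhi", "Mumbai", "London", "Delhi"],
})

first = df.duplicated()                 # default keep='first': flag all but the first copy
last = df.duplicated(keep="last")       # flag all but the last copy
all_copies = df.duplicated(keep=False)  # flag every copy in a duplicate group

print(first.tolist())
print(last.tolist())
print(all_copies.tolist())

# The existence check does not depend on the keep setting.
print(first.any(), last.any(), all_copies.any())
```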
