I want to check whether a DataFrame has rows with duplicate values in more than one column. For instance, in this dataset I want to count the entries that have duplicates of both 'STUDY_ID' and 'VISITCODE'. I tried to implement it like this but got a syntax error, and I don't know why.

bp[(bp.duplicated('STUDY_ID') == True) && (bp.duplicated('VISITCODE') == True)]

Isn't it possible to implement what I want this way? If not, what would be a better way?

1 Answer

You can change && to &, the bitwise AND that pandas applies element-wise to boolean Series, and omit == True:

bp[bp.duplicated('STUDY_ID') & bp.duplicated('VISITCODE')]

To check for duplicates across multiple columns together:

bp[bp.duplicated(['STUDY_ID', 'VISITCODE'])]
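The two expressions above are not interchangeable, and a small example may make the difference clearer. This is only a sketch: the rows below are made up, and only the column names come from the question. The first filter flags a row when each column, checked separately, repeats an earlier value; the second flags a row only when the whole (STUDY_ID, VISITCODE) pair has appeared before.

```python
import pandas as pd

# Hypothetical data; only the column names come from the question.
bp = pd.DataFrame({
    "STUDY_ID":  [1, 2, 1, 2, 3, 1],
    "VISITCODE": ["a", "b", "b", "a", "a", "a"],
})

# Each column checked on its own, masks combined element-wise with &.
per_column = bp[bp.duplicated("STUDY_ID") & bp.duplicated("VISITCODE")]

# The (STUDY_ID, VISITCODE) pair checked as one unit.
as_pair = bp[bp.duplicated(["STUDY_ID", "VISITCODE"])]

# keep=False marks every copy of a repeated pair, including the first.
all_copies = bp[bp.duplicated(["STUDY_ID", "VISITCODE"], keep=False)]

print(list(per_column.index))  # [2, 3, 5]
print(list(as_pair.index))     # [5]
print(len(all_copies))         # 2
```

If the goal is to count every entry that belongs to a repeated pair, the keep=False variant is probably the one to use, since the default leaves the first occurrence unmarked.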

7 Comments

Wow, that worked. But why doesn't && work? Does Python not support that?
No, Python uses and for scalars and & for arrays as the logical AND.
I see, so && is never used? Or only in some situations? Because && lights up green in the compiler.
Hard question, I have no idea why && lights up green in the compiler.
Do you need bp[bp.duplicated(['STUDY_ID', 'VISITCODE'])]?
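On the && question from the comments, a short sketch of how the three spellings behave in Python. The Series a and b are made-up stand-ins for the two duplicated(...) masks. Python has no && operator at all, so it fails at parse time; and asks for a single truth value, which a Series cannot supply; & works element by element.

```python
import pandas as pd

# Hypothetical boolean masks standing in for the duplicated(...) results.
a = pd.Series([True, False, True])
b = pd.Series([True, True, False])

# & combines the two masks element by element.
elementwise = a & b
print(elementwise.tolist())  # [True, False, False]

# `and` needs one truth value per operand; a Series refuses to give one.
try:
    a and b
    and_failed = False
except ValueError:
    and_failed = True

# && is not Python syntax, so the expression does not even compile.
try:
    compile("a && b", "<example>", "eval")
    double_amp_failed = False
except SyntaxError:
    double_amp_failed = True

print(and_failed, double_amp_failed)  # True True
```

Because & binds more tightly than comparison operators such as ==, comparisons combined with & need their own parentheses, which is why the original attempt wrapped each == True test in brackets.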