I have a DataFrame with the columns section, classA, classB, and classC. I'm trying to flag rows that contain duplicate values.

df

   section  classA  classB  classC
0  A        paul    john    paul
1  B        john    mark    tony
2  C        leo     leo     leo
3  D        tony    tony    mark
4  E        paul
5  F                mark    mark

Final df

   section  classA  classB  classC  duplicate
0  A        paul    john    paul    True
1  B        john    mark    tony    False
2  C        leo     leo     leo     True
3  D        tony    tony    mark    True
4  E        paul                    False
5  F                mark    mark    True
6  G                                False

I tried comparing the values in each row. How should I handle rows with empty cells?

1 Answer

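If the empty cells are empty strings (""), you can use set(): drop the empty cells from each row, then compare the number of remaining values with the size of their set. The two differ only when a value repeats: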

df["duplicate"] = df.apply(
    lambda x: len(set(x[x != ""])) != len(x[x != ""]), axis=1
)

print(df)

Prints:

  section classA classB classC  duplicate
0       A   paul   john   paul       True
1       B   john   mark   tony      False
2       C    leo    leo    leo       True
3       D   tony   tony   mark       True
4       E   paul                    False
5       F   mark   mark              True
6       G                           False
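
For a copy-paste test, here is a minimal self-contained sketch of the same set-based check. The sample data is rebuilt from the question and assumes blanks are empty strings; `class_cols` and `has_duplicates` are names chosen for illustration. Unlike the one-liner above, it looks only at the class columns, so a value in section can never be counted as a match:

```python
import pandas as pd

# Sample data rebuilt from the question; blanks are assumed to be empty strings.
df = pd.DataFrame(
    {
        "section": list("ABCDEFG"),
        "classA": ["paul", "john", "leo", "tony", "paul", "", ""],
        "classB": ["john", "mark", "leo", "tony", "", "mark", ""],
        "classC": ["paul", "tony", "leo", "mark", "", "mark", ""],
    }
)

# Illustrative choice: compare only the class columns, not "section".
class_cols = ["classA", "classB", "classC"]


def has_duplicates(row: pd.Series) -> bool:
    """Return True if any non-empty value occurs more than once in the row."""
    values = row[row != ""]
    # A set drops repeats, so a size mismatch means at least one repeat.
    return len(set(values)) != len(values)


df["duplicate"] = df[class_cols].apply(has_duplicates, axis=1)
print(df)
```

A row with no values at all (section G) compares an empty selection with an empty set, so it comes out as False without any special handling.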

2 Comments

Should I replace NaN with empty strings ("")?
@user15590480 If the empty cells are NaN or None, you can use this lambda instead: lambda x: len(set(x[x.notna()])) != len(x[x.notna()])
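
A minimal sketch of that NaN/None variant, assuming the blanks in the question's data are missing values rather than empty strings. The sample frame and the names `class_cols` and `has_duplicates` are illustrative, and the check is restricted to the class columns:

```python
import numpy as np
import pandas as pd

# Same data as the question, with blanks stored as NaN or None (assumption).
df = pd.DataFrame(
    {
        "section": list("ABCDEFG"),
        "classA": ["paul", "john", "leo", "tony", "paul", np.nan, np.nan],
        "classB": ["john", "mark", "leo", "tony", np.nan, "mark", np.nan],
        "classC": ["paul", "tony", "leo", "mark", None, "mark", None],
    }
)

# Illustrative choice: compare only the class columns, not "section".
class_cols = ["classA", "classB", "classC"]


def has_duplicates(row: pd.Series) -> bool:
    """Return True if any non-missing value occurs more than once in the row."""
    present = row[row.notna()]  # notna() is False for both NaN and None
    return len(set(present)) != len(present)


df["duplicate"] = df[class_cols].apply(has_duplicates, axis=1)
print(df)
```

Filtering with notna() first matters here: missing values are skipped instead of being compared with each other, so a row with several blanks is not reported as a duplicate.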
