Compare values of multiple pandas columns

Question

Let's say I have a pandas DataFrame with four columns, each holding strings. To check whether they are all the same in a given row, I came up with something like this:

df['same_FB'] = np.where( (df['FB_a'] == df['FB_b']) & (df['FB_a'] == df['FB_c']) & (df['FB_a'] == df['FB_d']), 1,0)

It works fine, but it doesn't look good, and if I had to add a fifth or sixth column it gets even uglier. Is there another way to test whether all columns are the same? Alternatively, I would be OK with counting the distinct values in these four columns.

Shubham Sharma · Accepted Answer · 2021-01-29 16:01:45Z

2

You can use DataFrame.eq + DataFrame.all:

x, *y = ['FB_a', 'FB_b', 'FB_c', 'FB_d']
df['same_FB'] = df[y].eq(df[x], axis=0).all(1).astype('i1')

Alternatively you can use nunique:

c = ['FB_a', 'FB_b', 'FB_c', 'FB_d']
df['same_FB'] = df[c].nunique(axis=1, dropna=False).eq(1).astype('i1')
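Since the question also asks about counting distinct values, here is a minimal sketch of the nunique approach that keeps the per-row count as its own column. The sample data and the `n_distinct` column name are made up for illustration; only the `FB_*` names come from the question.

```python
import pandas as pd

# Hypothetical sample data; the column names follow the question.
df = pd.DataFrame({
    "FB_a": ["x", "x", "x"],
    "FB_b": ["x", "y", "x"],
    "FB_c": ["x", "y", "z"],
    "FB_d": ["x", "x", "w"],
})

cols = ["FB_a", "FB_b", "FB_c", "FB_d"]

# Number of distinct values in each row across the chosen columns.
# With dropna=False a missing value counts as a distinct value of its own.
df["n_distinct"] = df[cols].nunique(axis=1, dropna=False)

# A row is "all the same" exactly when it holds a single distinct value.
df["same_FB"] = df["n_distinct"].eq(1).astype(int)

print(df)
```

Adding a fifth or sixth column only means extending `cols`.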

Example:

print(df)

    A  B  C  D  E
0  10  1  1  1  1
1  20  2  2  2  2
2  30  3  3  3  3
3  40  4  4  4  4

x,*y = ['B', 'C', 'D', 'E']
df['same'] = df[y].eq(df[x], axis=0).all(1).astype('i1')

print(df)

    A  B  C  D  E  same
0  10  1  1  1  1     1
1  20  2  2  2  2     1
2  30  3  3  3  3     1
3  40  4  4  4  4     1
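The example above only contains rows where everything matches. As a complementary sketch, the same eq + all pattern on string columns with one mismatching row might look like this (the data is invented for illustration; `first`, `rest`, and `matches` are just local names):

```python
import pandas as pd

# Hypothetical string data; row 1 has one value that differs.
df = pd.DataFrame({
    "FB_a": ["x", "x", "y"],
    "FB_b": ["x", "x", "y"],
    "FB_c": ["x", "z", "y"],
    "FB_d": ["x", "x", "y"],
})

# Split the column list into a reference column and the remaining ones.
first, *rest = ["FB_a", "FB_b", "FB_c", "FB_d"]

# Compare every remaining column with the reference column, row by row.
matches = df[rest].eq(df[first], axis=0)

# A row counts as "same" only if every comparison in that row is True.
df["same_FB"] = matches.all(axis=1).astype(int)

print(df)
```

Here `matches` is a boolean DataFrame with one column per compared column, which can be handy for seeing which column caused a mismatch.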

edited Jan 29, 2021 at 16:01

answered Jan 29, 2021 at 15:45


3 Comments

cvluepke Over a year ago

Thank you for your help. It works, but also I would like to understand it. What does the '1' do inside all()? Why do you need that?

Shubham Sharma Over a year ago

@cvluepke It checks whether all values along axis=1 (that is, across the columns of each row) are truthy. For more clarity you can write it as .all(axis=1).
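To make the axis argument concrete, here is a small sketch on a made-up boolean frame (the names `mask`, `per_row`, and `per_col` are only for illustration):

```python
import pandas as pd

# Hypothetical boolean frame, like the result of df[y].eq(df[x], axis=0).
mask = pd.DataFrame({
    "C": [True, True, True],
    "D": [True, False, True],
})

# axis=1: reduce across the columns, giving one result per row.
per_row = mask.all(axis=1)

# axis=0 (the default): reduce down the rows, giving one result per column.
per_col = mask.all(axis=0)

print(per_row)
print(per_col)
```

The per-row result is what gets stored in the new column, which is why the answer passes 1 to all().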

cvluepke Over a year ago

Ok got it. Thanks!

DapperDuck · Accepted Answer · 2021-01-29 15:40:59Z

1

You can use chained python logic. Here is the code:

df['same_FB'] = np.where((df['FB_a'] == df['FB_b'] == df['FB_c'] == df['FB_d']), 1,0)
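One caveat worth checking before relying on this: Python expands `a == b == c` into `(a == b) and (b == c)`, and `and` needs a single truth value, which a pandas Series does not provide. The sketch below, on invented data, shows the chained form raising a ValueError and one possible element-wise alternative using `functools.reduce`; it is an illustration, not the answerer's code.

```python
from functools import reduce
import operator

import pandas as pd

# Hypothetical sample data; the column names follow the question.
df = pd.DataFrame({
    "FB_a": ["x", "x", "y"],
    "FB_b": ["x", "x", "y"],
    "FB_c": ["x", "z", "y"],
    "FB_d": ["x", "x", "y"],
})
cols = ["FB_a", "FB_b", "FB_c", "FB_d"]

# The chained comparison calls bool() on a Series, which pandas refuses.
try:
    df["FB_a"] == df["FB_b"] == df["FB_c"] == df["FB_d"]
    chained_ok = True
except ValueError:
    chained_ok = False

# Element-wise alternative: compare each column with the first one,
# then combine the boolean Series with & instead of `and`.
pairwise = [df[cols[0]] == df[c] for c in cols[1:]]
df["same_FB"] = reduce(operator.and_, pairwise).astype(int)

print(chained_ok)
print(df)
```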

answered Jan 29, 2021 at 15:40


1 Comment

cvluepke Over a year ago

Thank you. That makes it look less complex.
