I have the following DataFrame:

import pandas as pd

data = {
    "a": [0, 1, 0, 1],
    "b": [0, 0, 1, 1],
}

# Load data into a DataFrame object
df = pd.DataFrame(data)

The expected output is df2:

data = {
    "a": [1, 0.75, 0.25, 1],
    "b": [1, 0.25, 0.75, 1],
}

# Load data into a DataFrame object
df2 = pd.DataFrame(data)

When a and b are the same, both columns of the output DataFrame should be 1. If a = 0 and b = 1, the output should be 0.25 and 0.75. If a = 1 and b = 0, the output should be 0.75 and 0.25. How can I do this without a for loop? Thanks in advance.

1 Answer

Use DataFrame.replace with DataFrame.mask to set 1 where the values in a row are the same:

# Map 0 -> 0.25 and 1 -> 0.75, then set rows with equal values (row-wise std of 0) to 1
df = df.replace({0: .25, 1: .75}).mask(df.std(axis=1).eq(0), 1)
print(df)

      a     b
0  1.00  1.00
1  0.75  0.25
2  0.25  0.75
3  1.00  1.00
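
The mask works because the row-wise standard deviation is 0 exactly when both values in a row are equal. As a minimal sketch of that step on its own, the snippet below rebuilds the question's df and compares the std-based mask with an alternative based on DataFrame.nunique. The nunique variant and the names same_by_std, same_by_nunique and result are illustrative and not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({"a": [0, 1, 0, 1], "b": [0, 0, 1, 1]})

# Row-wise standard deviation is 0 only when all values in the row are equal
same_by_std = df.std(axis=1).eq(0)

# Illustrative alternative: a row with a single distinct value has equal entries
same_by_nunique = df.nunique(axis=1).eq(1)

# Recode the values first, then overwrite the "same" rows with 1
result = df.replace({0: 0.25, 1: 0.75}).mask(same_by_nunique, 1)
print(result)
```

Either mask can be passed to mask; the nunique form may read more directly as "all values in the row are the same".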

Another idea, starting again from the original df, uses broadcasting in numpy.select:

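import numpy as np

# One boolean mask per case
m1 = df['a'] == df['b']
m2 = (df['a'] == 0) & (df['b'] == 1)
m3 = (df['a'] == 1) & (df['b'] == 0)

# Reshape each mask to a column vector so it broadcasts against the two-element choices
masks = [m1.to_numpy()[:, None],
         m2.to_numpy()[:, None],
         m3.to_numpy()[:, None]]

df[['a','b']] = np.select(masks, [[1,1], [0.25,0.75], [0.75,0.25]])

print(df)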
      a     b
0  1.00  1.00
1  0.75  0.25
2  0.25  0.75
3  1.00  1.00
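
To check both approaches against the expected df2 from the question, something like the following sketch could be used. It assumes pandas and NumPy are installed; the names expected, out_replace and out_select are introduced here for illustration, and each approach starts from a fresh copy of the original data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [0, 1, 0, 1], "b": [0, 0, 1, 1]})
expected = pd.DataFrame({"a": [1, 0.75, 0.25, 1], "b": [1, 0.25, 0.75, 1]})

# Approach 1: replace + mask
out_replace = df.replace({0: 0.25, 1: 0.75}).mask(df.std(axis=1).eq(0), 1)

# Approach 2: numpy.select with column-vector masks broadcast against row choices
same = (df["a"] == df["b"]).to_numpy()[:, None]
a0b1 = ((df["a"] == 0) & (df["b"] == 1)).to_numpy()[:, None]
a1b0 = ((df["a"] == 1) & (df["b"] == 0)).to_numpy()[:, None]
out_select = pd.DataFrame(
    np.select([same, a0b1, a1b0], [[1, 1], [0.25, 0.75], [0.75, 0.25]]),
    index=df.index,
    columns=df.columns,
)

# Both raise AssertionError if a result differs from the expected frame
pd.testing.assert_frame_equal(out_replace, expected)
pd.testing.assert_frame_equal(out_select, expected)
```

Building new frames instead of assigning back into df keeps the original data intact, so the two approaches can be compared side by side.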