
I have a pandas DataFrame df with the following structure:

+------+------+
| col1 | col2 |
+------+------+
|  2   |  1   |
+------+------+
|  nan |  3   |
+------+------+
|  nan | nan  |
+------+------+

Some values in col1 and col2 are NaN, and the rest are integers. I would like to add a new column col3 to df whose value depends on the other two columns.

If both col1 and col2 hold integer values in a row, col3 should be 0. If col1 is NaN and col2 is not, col3 should be 1. Finally, if both col1 and col2 are NaN, col3 should be 2.

How can I do it?

1 Answer


Use numpy.select with a list of conditions. The default value covers the remaining case, where col1 is not NaN and col2 is NaN:

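import numpy as np
import pandas as pd

# Sample data; the last row covers the case the question does not mention
df = pd.DataFrame({'col1': [2, np.nan, np.nan, 5],
                   'col2': [1, 3, np.nan, np.nan]})

# Boolean masks: True where the value is missing
m1 = df['col1'].isna()
m2 = df['col2'].isna()

# Conditions are checked in order; rows matching none of them get the default
df['out'] = np.select([~m1 & ~m2, m1 & ~m2, m1 & m2], [0, 1, 2], default=3)
print(df)
   col1  col2  out
0   2.0   1.0    0
1   NaN   3.0    1
2   NaN   NaN    2
3   5.0   NaN    3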
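
If you need this in more than one place, the same idea could be wrapped in a small helper. This is only a sketch: the function name classify_missing, its parameters, and the fallback code of -1 are hypothetical choices for illustration, not part of pandas or NumPy. It writes to col3, the column name used in the question, and marks the unmentioned case (col1 present, col2 missing) with the fallback.

```python
import numpy as np
import pandas as pd


def classify_missing(df, first="col1", second="col2", fallback=-1):
    """Hypothetical helper: code each row by which of two columns is missing.

    0 -> both present, 1 -> only `first` missing, 2 -> both missing,
    `fallback` -> any other combination (here: only `second` missing).
    """
    m1 = df[first].isna()
    m2 = df[second].isna()
    codes = np.select([~m1 & ~m2, m1 & ~m2, m1 & m2], [0, 1, 2], default=fallback)
    # Wrap in a Series so the result keeps the DataFrame's index
    return pd.Series(codes, index=df.index, name="col3")


df = pd.DataFrame({"col1": [2, np.nan, np.nan, 5],
                   "col2": [1, 3, np.nan, np.nan]})
df["col3"] = classify_missing(df)
print(df["col3"].tolist())  # [0, 1, 2, -1]
```

Using a distinct fallback such as -1 makes rows outside the three stated rules easy to spot; pick whatever value suits your data.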