Using boolean series as masks is very handy in pandas.

I was wondering whether, and how, one can generate two-dimensional boolean arrays to use as masks, e.g. with the where() or mask() functions, to assign values to the set of cells the mask selects.

The idea is to have a data frame and a two-dimensional boolean array of the same dimensions as the data frame and to set all cells that are True in the boolean array to value X while leaving all other data cells in the data frame untouched.

This certainly could be accomplished with a bunch of for loops stepping through the data frame and boolean array in parallel, but that does not seem very efficient or elegant.

Any pointers to the appropriate function names or tutorials would be very much appreciated.

  • Did you try df.where(arr)? It should work as long as arr.shape == df.shape and arr has a boolean dtype. Commented May 19, 2020 at 14:45
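The suggestion in the comment can be sketched as below. The data, the array `arr`, and the replacement value `X` are made up for illustration. One detail worth keeping in mind: where() keeps the cells where the condition is True and replaces the others, while mask() does the opposite, so mask() is the call that matches "set all cells that are True to X".

```python
import numpy as np
import pandas as pd

# Hypothetical example data; column names and values are made up.
df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]})

# Two-dimensional boolean array with the same shape as df.
arr = np.array([[True, False],
                [False, True]])

X = -1  # placeholder replacement value

# mask() replaces the cells where arr is True and leaves the rest untouched.
replaced_true = df.mask(arr, X)

# where() is the inverse: it keeps the cells where arr is True
# and replaces all the others.
replaced_false = df.where(arr, X)
```

Both calls return a new DataFrame and leave `df` itself unchanged.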

1 Answer


Create a DataFrame:

import pandas as pd

d = {'col1': [1, 2], 'col2': [3, 4]}
df = pd.DataFrame(data=d)

Create a boolean mask DataFrame with the same shape and column names:

m = {'col1': [True, True], 'col2': [True, True]}
df_mask = pd.DataFrame(data=m)

Create the new DataFrame by indexing with the boolean mask. Cells where the mask is False become NaN; in this case every mask value is True, so the result is the same as the original DataFrame.

masked_df = df[df_mask]

This question about selecting with complex criteria from pandas.DataFrame might help.
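To tie this back to the question, here is a minimal sketch of assigning a value to the cells selected by a boolean DataFrame. It assumes the mask is built from a comparison rather than typed by hand; the threshold `2` and the replacement value `X` are placeholders chosen for illustration.

```python
import pandas as pd

# Same example data as in the answer above.
df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]})

# A comparison on a DataFrame returns a boolean DataFrame of the same shape.
df_mask = df > 2

# Selecting with the mask keeps True cells and turns the others into NaN.
selected = df[df_mask]

X = 0  # placeholder replacement value

# Assigning through the mask sets only the True cells to X.
# Work on a copy so the original df stays untouched.
result = df.copy()
result[df_mask] = X
```

Assigning through the mask modifies the frame in place, which is why the sketch copies `df` first; `df.mask(df_mask, X)` gives the same result without the explicit copy.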
