3

I know the following logical operation works with numpy:

import numpy as np

A = np.array([True, False, True])
B = np.array([1.0, 2.0, 3.0])
C = A * B  # array([1., 0., 3.])

But the same isn't true if B is an array of strings. Is it possible to do the following:

A = np.array([True, False, True])
B = np.array(['eggs', 'milk', 'cheese'])
C = A * B  # desired result: array(['eggs', '', 'cheese'])

That is, a string multiplied by False should give an empty string. Can this be done without a loop in Python? (It doesn't have to use numpy.)

Thanks!

3 Answers

8

You can use np.where to make such a selection based on a mask:

np.where(A,B,'')

Sample run -

In [4]: A
Out[4]: array([ True, False,  True], dtype=bool)

In [5]: B
Out[5]: 
array(['eggs', 'milk', 'cheese'], 
      dtype='|S6')

In [6]: np.where(A,B,'')
Out[6]: 
array(['eggs', '', 'cheese'], 
      dtype='|S6')
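
For anyone who wants to paste and run this, here is a minimal self-contained sketch of the same np.where call. It assumes Python 3, where the string array gets a unicode dtype ('<U6') rather than the '|S6' shown above.

```python
import numpy as np

A = np.array([True, False, True])
B = np.array(['eggs', 'milk', 'cheese'])

# Keep the element of B where the mask is True,
# otherwise substitute the empty string.
C = np.where(A, B, '')

print(C.tolist())  # ['eggs', '', 'cheese']
```

B itself is left unchanged; np.where returns a new array.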

3

np.char applies string methods to elements of an array:

In [301]: np.char.multiply(B, A.astype(int))
Out[301]: 
array(['eggs', '', 'cheese'], 
      dtype='<U6')

I had to convert the boolean array to integers and pass it as the second argument.

Timings in other questions indicate that np.char iterates over the elements and applies the Python string methods, so its speed is about the same as a list comprehension.
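
If you want to check the speed claim on your own machine, something like the following rough sketch should do. The array size, the random data, and the helper names (with_char, with_listcomp, with_where) are my own choices for illustration, and no particular timings are implied.

```python
import timeit
import numpy as np

# Hypothetical test data: 10,000 random words and a random boolean mask.
rng = np.random.default_rng(0)
words = np.array(['eggs', 'milk', 'cheese'])
B_big = words[rng.integers(0, 3, size=10_000)]
A_big = rng.random(10_000) < 0.5

def with_char():
    # np.char.multiply repeats each string by the matching integer (0 or 1).
    return np.char.multiply(B_big, A_big.astype(int))

def with_listcomp():
    # Plain Python loop over native bools and strs.
    return np.array([b * a for a, b in zip(A_big.tolist(), B_big.tolist())])

def with_where():
    # Mask-based selection.
    return np.where(A_big, B_big, '')

for fn in (with_char, with_listcomp, with_where):
    seconds = timeit.timeit(fn, number=20)
    print(f"{fn.__name__}: {seconds:.4f} s for 20 runs")
```

All three functions should return the same strings, so any difference is purely in running time.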

For an in-place change, use masked assignment instead of np.where:

In [306]: B[~A]=''
In [307]: B
Out[307]: 
array(['eggs', '', 'cheese'], 
      dtype='<U6')
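
Note that the masked assignment overwrites B. A small sketch, assuming you still need the original values afterwards, is to assign into a copy:

```python
import numpy as np

A = np.array([True, False, True])
B = np.array(['eggs', 'milk', 'cheese'])

C = B.copy()   # work on a copy so B keeps its original contents
C[~A] = ''     # blank out the entries where the mask is False

print(C.tolist())  # ['eggs', '', 'cheese']
print(B.tolist())  # ['eggs', 'milk', 'cheese']
```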

2

Since strings may be multiplied by integers, and booleans are integers:

A = [True, False, True]
B = ['eggs', 'milk', 'cheese']
C = [a*b for a, b in zip(A, B)]
# C = ['eggs', '', 'cheese']

It still uses some kind of loop (same as the numpy solution), but it's hidden in a concise list comprehension.

Alternatively:

C = [b if a else '' for a, b in zip(A, B)]  # explicit conditional may be clearer than the multiply-sequence trick

2 Comments

"I still uses some kind of loop (same as numpy solution), but it's hidden in concise list comprehension." - generally, when working with NumPy, you want your loops to be happening in C, not in list comprehensions. C loops get to avoid a ton of overhead. Python loops are generally somewhere from dozens to thousands of times slower.
@user2357112 To be honest, it's not clear to me whether the OP uses numpy because he really needs a powerful linear algebra toolbox or just because it's the only way he knows to do elementwise operations. Storing strings in numpy arrays is pretty peculiar; it's not like you'll do matrix-vector multiplication with them... I just provided an alternative that doesn't have to use numpy. Feel free to upvote or downvote.
