
I need to evaluate many logical conditions on a large 2D "NUMPY" array, and collect the overall result in a boolean "RESULT" numpy array.

A simple example where all conditions are linked with an AND statement could be:

RESULT = cond1(NUMPY) & cond2(NUMPY) & cond3(NUMPY) & ...

I would like to understand if there is a way to optimize performance.

For example, in this case, if the first condition (cond1) is False for most of the values in the NUMPY array, evaluating all the other conditions on those values is a waste of resources, since the AND will produce False in the final RESULT array anyway.

Any ideas?

  • Python's "and" and "or" operators short-circuit, but only for scalar conditions. With numpy whole-array operations, each condition is evaluated in full, and then the results are combined. You'd have to use numba or cython to write a faster iterative test that implements short-circuiting. Commented Mar 24, 2019 at 15:38
  • Thank you for the explanation and suggestion, I am not so familiar with numba and cython yet, but I will look into those if I do not find another way :) Commented Mar 24, 2019 at 19:48
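
To illustrate the difference described in the comment above, here is a minimal sketch. The functions cond1 and cond2 and the calls list are made up for the demonstration; they are not the asker's actual conditions.

```python
import numpy as np

calls = []  # hypothetical log of which condition functions were invoked

def cond1(x):
    calls.append("cond1")
    return x > 10

def cond2(x):
    calls.append("cond2")
    return x < 20

# Scalar case: "and" stops as soon as the outcome is known,
# so cond2 is never called when cond1 is already False.
calls.clear()
scalar_result = cond1(5) and cond2(5)
scalar_calls = list(calls)

# Array case: "&" needs both operands first, so each condition
# is evaluated on every element before the results are combined.
arr = np.array([5, 15, 25])
calls.clear()
array_result = cond1(arr) & cond2(arr)
array_calls = list(calls)

print(scalar_calls, array_calls, array_result)
```

In the scalar case only cond1 runs; in the array case both run over the whole array, which is the cost the question is asking about.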

1 Answer

You can do the short-circuiting by hand, though I should add that this is probably only worth it in rather extreme cases.

Here is an example of 99 chained logical ANDs (100 conditions). The short-circuiting is done either with the where keyword of the ufuncs or with fancy indexing. In this example, only the fancy-indexing variant gives a decent speed-up; the where variant does not.

import numpy as np

a = np.random.random((1000,))*1.5
c = np.random.random((100, 1))*1.5

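def direct():
    # Evaluate all 100 conditions on every element, then AND along axis 0.
    return ((a+c) < np.arccos(np.cos(a+c)*0.99)).all(0)

def trickya():
    # "where" keyword: each ufunc only computes entries where out is still True.
    # Skipped entries of the temporaries are left uninitialized, and out stays False there.
    out = np.ones(a.shape, '?')
    for ci in c:
        np.logical_and(out, np.less(np.add(a, ci, where=out), np.arccos(np.multiply(np.cos(np.add(a, ci, where=out), where=out), 0.99, where=out), where=out), where=out), out=out, where=out)
    return out

def trickyb():
    # Fancy indexing: keep the indices that passed so far and test only those.
    idx, = np.where((a+c[0]) < np.arccos(np.cos(a+c[0])*0.99))
    for ci in c[1:]:
        idx = idx[(a[idx]+ci) < np.arccos(np.cos(a[idx]+ci)*0.99)]
    # Everything that survived all conditions is True; the rest is False.
    out = np.zeros(a.shape, '?')
    out[idx] = True
    return out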

assert (direct()==trickya()).all()
assert (direct()==trickyb()).all()

from timeit import timeit

print('direct  ', timeit(direct, number=100))
print('where kw', timeit(trickya, number=100))
print('indexing', timeit(trickyb, number=100))

Sample run:

direct   0.49512664100620896
where kw 0.494946873979643
indexing 0.17760096595156938
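
The indexing idea can be wrapped in a small helper for the situation in the question (a 2D array and a list of conditions). This is only a sketch under assumptions: chained_and is a hypothetical name, the three lambdas are made-up conditions, and every condition is assumed to be elementwise, i.e. it depends only on the values passed in and returns a boolean array of the same length.

```python
import numpy as np

def chained_and(arr, conditions):
    """AND together elementwise conditions, re-testing only surviving elements."""
    flat = arr.ravel()
    idx = np.arange(flat.size)        # indices that are still candidates
    for cond in conditions:
        if idx.size == 0:             # nothing left to test
            break
        idx = idx[cond(flat[idx])]    # keep only the indices that pass
    out = np.zeros(flat.size, dtype=bool)
    out[idx] = True
    return out.reshape(arr.shape)

rng = np.random.default_rng(0)
NUMPY = rng.random((200, 300))

# Made-up conditions; the first one rejects most values.
conds = [lambda x: x > 0.9, lambda x: np.sin(x) > 0.8, lambda x: x * x < 0.95]

RESULT = chained_and(NUMPY, conds)
expected = conds[0](NUMPY) & conds[1](NUMPY) & conds[2](NUMPY)
print(np.array_equal(RESULT, expected))
```

Whether this is actually faster than the plain & chain depends on how expensive the conditions are and how quickly the candidate set shrinks, so it is worth timing on the real data, as in the benchmark above.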

1 Comment

Very interesting! I will try and have a look if I can use a similar way to speed up my code :). Thank you!
