I have two boolean NumPy arrays of indicators:

                          v                          v              v
A =    np.array([0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1], dtype=bool)
B =    np.array([1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1], dtype=bool)
                                         ^                 ^        ^

Moving from left to right, I would like to isolate the first true A indicator, then the next true B indicator, then the next true A indicator, then the next true B indicator, etc. to end up with:

                          v                          v              v
>>>> A_result = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1]
     B_result = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1]
                                         ^                 ^        ^

I have a feeling I could create a betweenAB array indicating all the places where A==1 is followed by B==1:

                          v                          v              v
betweenAB =     [0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1]
                                         ^                 ^        ^

then take the start and end indices of each run, but I am still somewhat of a beginner when it comes to Numpy and am not sure how I might do that.
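
For the second half of that idea, reading the start and end of each run off an existing betweenAB array, a minimal NumPy sketch could look like the following. The names `between_ab`, `starts`, `ends`, `a_result` and `b_result` are illustrative only, and the sketch assumes consecutive runs are separated by at least one 0; it does not address how to build betweenAB in the first place.

```python
import numpy as np

# The betweenAB array from the question (1 marks positions inside an A-to-B run).
between_ab = np.array(
    [0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1], dtype=bool
)

# Pad with False on both sides so that runs touching either edge still
# produce a rising and a falling edge.
padded = np.concatenate(([False], between_ab, [False])).astype(np.int8)
edges = np.diff(padded)

starts = np.flatnonzero(edges == 1)     # first index of each run
ends = np.flatnonzero(edges == -1) - 1  # last index of each run (inclusive)

# Mark only the run boundaries in the two result arrays.
a_result = np.zeros(between_ab.size, dtype=bool)
b_result = np.zeros(between_ab.size, dtype=bool)
a_result[starts] = True
b_result[ends] = True
```

Two runs that touch (a B immediately followed by a new A) would merge into one run here, so this only covers the case shown in the example.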

I'm looking for a fully vectorized approach as there are thousands of these arrays in my application each containing thousands of elements. Any help would be much appreciated.

1 Answer

This is hard to do efficiently with NumPy alone (it is probably not possible without loops), but it is easy and efficient with Numba's JIT. The reason is that the operation is inherently sequential: which element is picked next depends on what was picked before it.

Here is an example in Numba:

import numpy as np
import numba as nb

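# The signature string is optional. `boolean` is Numba's name for the boolean element type.
@nb.jit('UniTuple(boolean[::1],2)(boolean[::1],boolean[::1])')
def compute(A, B):
    assert len(A) == len(B)
    n = len(A)
    i = 0
    # Both results start all False; only the selected positions are set below.
    resA = np.zeros(n, dtype=np.bool_)
    resB = np.zeros(n, dtype=np.bool_)
    while i < n:
        # Skip ahead to the next true A.
        while i < n and A[i] == 0:
            i += 1
        if i < n:
            resA[i] = 1
            # A and B are both true at this index: keep both, then look for the next A.
            if B[i] == 1:
                resB[i] = 1
                i += 1
                continue
            i += 1
        # Skip ahead to the next true B.
        while i < n and B[i] == 0:
            i += 1
        if i < n:
            resB[i] = 1
            i += 1
    return resA, resB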
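
To see the alternation at work on the question's arrays without installing Numba, here is a minimal plain-Python sketch of the same left-to-right scan, written as a two-state loop. The name `alternate_reference` is hypothetical and not part of the answer above; it is only meant as a slow reference for checking results, and it assumes both inputs have the same length.

```python
import numpy as np


def alternate_reference(A, B):
    """Slow reference: alternately keep the next true A, then the next true B."""
    res_a = np.zeros(len(A), dtype=bool)
    res_b = np.zeros(len(B), dtype=bool)
    want_a = True  # True while scanning for an A, False while scanning for a B
    for i in range(len(A)):
        if want_a and A[i]:
            res_a[i] = True
            want_a = False
        # Deliberately not an elif: a true B at the same index as the A
        # that was just kept is kept as well.
        if not want_a and B[i]:
            res_b[i] = True
            want_a = True
    return res_a, res_b


A = np.array([0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1], dtype=bool)
B = np.array([1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1], dtype=bool)
res_a, res_b = alternate_reference(A, B)
print(np.flatnonzero(res_a))  # indices of the kept A indicators
print(np.flatnonzero(res_b))  # indices of the kept B indicators
```

On the example this keeps A at indices 3, 12 and 17 and B at indices 8, 14 and 17, which matches the result requested in the question.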

12 Comments

This is great, thank you for this. This gets me to the betweenAB array. I have never used Numba before; would you mind recommending how you would arrive at the desired end result (second code block in the question, i.e. the two updated A and B arrays) using this method?
Ok. I did not check the code carefully, but it looked fine for the example. Thank you for the fix. Regarding the Numba nb.jit decorator, this string specifies the types of the parameters and the return type. It is optional, but it is often better to add it so that the function is compiled when it is defined rather than at the first call. The syntax is "returnType(ParamType1, ParamType2, ...)". Basic types include int32, int64, float32... A 1D array type is written type[:], a 2D array type[:,:], and so on. ::1 can be used to specify that the dimension is contiguous so that the compiled code will be faster.
You can find more information in the Numba documentation.
Ok. If you can run it without errors and get correct results, it means it is fine (or ignored by Numba which could be possible but would be surprising). Yes, 2 is the size and Uni means the 2 items are of the same type (uniform).
I just found an answer which seems to clear this up: stackoverflow.com/a/35654032/12814841