For a list of sorted numpy arrays of unequal lengths (say M0, M1, M2), I want to find how many elements of each array fall inside number ranges given by adjoining pairs of elements of another array (say zbin). zbin is not sorted, and the number ranges are [zbin[0], zbin[1]], [zbin[2], zbin[3]], [zbin[4], zbin[5]] and so on; zbin always has an even number of elements. The unsorted nature of zbin and the use of adjoining pairs in zbin to define the number ranges make this question different from the one asked here: Number of elements of numpy arrays inside specific bins. In that question, zarr was sorted and adjoining elements gave the number ranges (here adjoining pairs give the number ranges).

This is what I am doing presently:

import numpy as np

# Count, for each sorted array in lst, the elements inside the closed range numrange
def search(numrange, lst):
    arr = np.zeros(len(lst))
    for i in range(len(lst)):
        probe = lst[i]
        count = 0
        for j in range(len(probe)):
            # probe is sorted, so stop once we are past the upper edge
            if probe[j] > numrange[1]:
                break
            if numrange[0] <= probe[j] <= numrange[1]:
                count += 1

        arr[i] = count
    return arr


# Some examples of sorted one-dimensional arrays of unequal lengths
M0 = np.array([5.1, 5.4, 6.4, 6.8, 7.9])
M1 = np.array([5.2, 5.7, 8.8, 8.9, 9.1, 9.2])
M2 = np.array([6.1, 6.2, 6.5, 7.2])

# Implementation and output
lst = [M0, M1, M2]
zbin = np.array([5.0, 5.2, 5.1, 5.3, 5.2, 5.4])

zarr = np.zeros((len(zbin)//2, len(lst)))
for i in range(len(zbin)//2):
    indx = i*2
    numrange = [zbin[indx], zbin[indx+1]]
    zarr[i,:] = search(numrange, lst)

print(zarr)

The output is:

[[ 1.  1.  0.]
 [ 1.  1.  0.]
 [ 1.  1.  0.]]

Here, the first row of zarr ([1, 1, 0]) shows that M0 has 1 element in the number range [5.0, 5.2], M1 has 1 element and M2 has 0 elements. The second and third rows show the results for the subsequent number ranges, i.e. [5.1, 5.3] and [5.2, 5.4].

I want to know the fastest way to compute zarr. In my actual task, I will be dealing with a much bigger zbin and many more arrays (M). I would very much appreciate any help.

  • Use out[::2] from the accepted solution of your linked previous question? Commented May 16, 2018 at 16:46
  • Or, to save on memory, initialize with out = np.empty((len(zbin)//2, len(lst)), dtype=int) and then use zbin[:-1:2] and zbin[1::2] to get the left and right indices? Commented May 16, 2018 at 16:48
  • Divakar: As always, this is a great answer! Many thanks :) Commented May 16, 2018 at 21:24

1 Answer

Not sure numpy would really get you any speed up, but here's an attempt:

lst = [M0, M1, M2]
zbin = np.array([5.0, 5.2, 5.1, 5.3, 5.2, 5.4])

zarr = np.zeros((len(zbin)//2, len(lst)), dtype=float)

for i,M in enumerate(lst):
    # Compare M against every lower/upper edge at once (one row per range), then count hits per row
    zarr[:,i] = np.count_nonzero(np.logical_and(M >= zbin[::2, np.newaxis],
                                                M <= zbin[1::2, np.newaxis]), axis=1)

In [10]: zarr
Out[10]: 
array([[1., 1., 0.],
       [1., 1., 0.],
       [1., 1., 0.]])
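
To see why the comparison above yields one count per range, it may help to look at the shapes involved. The sketch below uses only M0 and the zbin from the question; the names lo, hi, mask and counts are illustrative and not part of the code above.

```python
import numpy as np

M0 = np.array([5.1, 5.4, 6.4, 6.8, 7.9])
zbin = np.array([5.0, 5.2, 5.1, 5.3, 5.2, 5.4])

# Lower and upper edges as column vectors of shape (3, 1).
lo = zbin[::2, np.newaxis]
hi = zbin[1::2, np.newaxis]

# Broadcasting (3, 1) against (5,) gives a (3, 5) boolean mask:
# row k marks the elements of M0 that fall inside range k.
mask = (M0 >= lo) & (M0 <= hi)

# Counting the True values along each row gives one count per range.
counts = np.count_nonzero(mask, axis=1)
print(mask.shape)  # (3, 5)
print(counts)      # [1 1 1]
```

The mask holds len(zbin)//2 times len(M) booleans, so memory and time grow with the product of the two sizes.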

By the way, if you can exploit the sorted nature of the arrays, @Divakar's solution from the linked question should definitely be faster.
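
As a rough sketch of what exploiting the sorted arrays could look like, one option is np.searchsorted with the zbin[::2] and zbin[1::2] slices suggested in the comments. This is my reading of that suggestion rather than a copy of the linked solution, and it assumes each pair in zbin is ordered as (lower, upper); the names lo, hi, left, right and out are illustrative.

```python
import numpy as np

M0 = np.array([5.1, 5.4, 6.4, 6.8, 7.9])
M1 = np.array([5.2, 5.7, 8.8, 8.9, 9.1, 9.2])
M2 = np.array([6.1, 6.2, 6.5, 7.2])
lst = [M0, M1, M2]
zbin = np.array([5.0, 5.2, 5.1, 5.3, 5.2, 5.4])

lo = zbin[::2]   # lower edge of each range
hi = zbin[1::2]  # upper edge of each range (assumed >= the matching lower edge)

out = np.empty((len(lo), len(lst)), dtype=int)
for i, M in enumerate(lst):
    # side='left': index of the first element of M that is >= lo
    left = np.searchsorted(M, lo, side='left')
    # side='right': index just past the last element of M that is <= hi
    right = np.searchsorted(M, hi, side='right')
    # Elements in the closed range [lo, hi] sit between those two indices.
    out[:, i] = right - left

print(out)
# [[1 1 0]
#  [1 1 0]
#  [1 1 0]]
```

Each lookup is a binary search, so the cost per array grows with the number of ranges times log(len(M)) rather than with the number of ranges times len(M). Only the arrays in lst need to be sorted; zbin itself can stay unsorted.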

1 Comment

filippo: Thanks for this very nice answer! This works and the speed is as per my requirements. :)
