Python: Reduce an array by keeping only the numbers between two limits

Question

I have an Nx4 array and I want to reduce it by keeping only the rows whose values in the second and third columns fall within a specific range. I have written code that does not work because it does not take into account that I am already shrinking the array while looping over it.

Example of data/array :

1   358 33  7.1
2   659 85  7.1
3   111 145 7.1
4   558 116 7.1
5   632 40  7.1
6   415 335 7.1
7   207 30  7.1
8   564 47  7.1
9   352 41  7.1
10  700 570 7.1
11  275 499 7.1
12  482 177 7.1
13  737 565 7.1
14  298 43  7.1
15  155 195 7.1
16  598 417 7.1
17  93  313 7.1
18  1150    597 7.1
19  410 451 7.1
20  34  793 7.1
21  997 904 7.1
22  1024    452 7.1
23  740 128 7.1
24  522 86  7.1
25  679 643 7.1
26  973 37  7.1
27  372 42  7.1

For example, I want to keep the rows whose values are in the range [80, 2000] for the second column and in the range [130, 2000] for the third one. My real array has over 1,000,000 rows.

Here is my code:

def filter_data(data, XRANGE, YRANGE) :

    data_f = np.copy(data)

    for l in range(len(data_f)) :

        if XRANGE[0] < data_f[l,1] < XRANGE[1] and YRANGE[0] < data_f[l,1] < YRANGE[1] :

            pass
        
        else :

            data = np.delete(data, l, axis=0)

    return data

How could I do this differently and much more efficiently?

Ivan · Accepted Answer · 2021-07-29 14:38:14Z

You can do this by building boolean masks and combining them with an element-wise product (equivalent to the AND operator on booleans):

>>> x_range, y_range = [0, 2], [0, 5]

>>> data
array([[ 1,  2,  3],
       [ 1,  1,  1],
       [ 5,  1,  7],
       [ 1, 10,  2]])

First construct two masks based on constraints on data[:, 0] and data[:, 1]:

>>> x_mask = (data[:,0] > x_range[0])*(data[:,0] < x_range[1])
>>> x_mask
array([ True,  True, False,  True])

>>> y_mask = (data[:,1] > y_range[0])*(data[:,1] < y_range[1])
>>> y_mask
array([ True,  True,  True, False])

The combined mask is equivalent to the condition (x > x_min) & (x < x_max) & (y > y_min) & (y < y_max):

>>> x_mask*y_mask
array([ True,  True, False, False])

>>> data[x_mask*y_mask]
array([[1, 2, 3],
       [1, 1, 1]])
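
Applied to the question's data, the same idea might look like the sketch below. It assumes the limits are exclusive (as in the question's loop) and that the columns to test sit at indices 1 and 2. It combines the masks with `&` in place of `*`, which gives the same result for boolean arrays. The function name `filter_rows` and its `x_col`/`y_col` parameters are illustrative and not part of the original answer.

```python
import numpy as np

def filter_rows(data, x_range, y_range, x_col=1, y_col=2):
    """Keep rows whose x_col and y_col values lie strictly inside the given ranges."""
    x = data[:, x_col]
    y = data[:, y_col]
    # Parentheses are required: & binds more tightly than the comparisons.
    mask = (x > x_range[0]) & (x < x_range[1]) & (y > y_range[0]) & (y < y_range[1])
    return data[mask]

# First six rows of the sample data from the question.
sample = np.array([
    [1, 358,  33, 7.1],
    [2, 659,  85, 7.1],
    [3, 111, 145, 7.1],
    [4, 558, 116, 7.1],
    [5, 632,  40, 7.1],
    [6, 415, 335, 7.1],
])

filtered = filter_rows(sample, x_range=(80, 2000), y_range=(130, 2000))
print(filtered)  # only rows 3 and 6 pass both tests
```

Boolean indexing builds the reduced array in one vectorized step, so no rows are deleted one at a time and the index-shifting problem of the loop does not arise.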

amstergc20 · Accepted Answer · 2021-07-29 14:49:15Z

Here is a simple example of what I think you are asking for.

I first create an example array of shape (n, 3).

Then I find where the values in the second and third columns exceed a threshold (let's call it 4) and multiply that boolean mask by the original second and third columns, which sets every value that does not exceed the threshold to 0.

Lastly, I concatenate this new array with the first column of the original array as follows:

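import numpy as np

a = np.asarray([[2,3,4],
                [3,4,5],
                [4,5,6],
                [10,12,14]])
val = 4
b = a[:,1:3] > val   # boolean mask for the second and third columns
c = a[:,1:3]*b       # values that do not exceed val become 0

# Reattach the untouched first column.
np.concatenate((a[:,0:1],c),axis=1)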

EDIT: After you updated your example, here is the same approach for an (n, 4) array:

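a = np.asarray([[2,3,4,5],
                [3,4,5,6],
                [4,5,6,8],
                [10,12,14,9]])

val = 4
b = a[:,1:3] > val   # boolean mask for the second and third columns
c = a[:,1:3]*b       # values that do not exceed val become 0

# Reattach the untouched first and fourth columns.
np.concatenate((a[:,0:1],c,a[:,3:4]),axis=1)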
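
Note that the approach above keeps every row and replaces failing values with 0. If the goal is to drop those rows instead, as the question describes, one possible variation (a sketch, not part of the original answer) is to reduce the mask `b` across columns with `all(axis=1)` and index the array with the result. The names `keep` and `reduced` are introduced here for illustration.

```python
import numpy as np

a = np.asarray([[2, 3, 4, 5],
                [3, 4, 5, 6],
                [4, 5, 6, 8],
                [10, 12, 14, 9]])
val = 4

# True where a value in the second or third column exceeds the threshold.
b = a[:, 1:3] > val

# A row is kept only if both tested columns pass.
keep = b.all(axis=1)

# Boolean indexing returns a new array containing only the kept rows.
reduced = a[keep]
print(reduced)
```

Here the first two rows are dropped because at least one of their tested values does not exceed 4, and the last two rows are returned unchanged.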
