I have an Nx4 array and I want to reduce it by keeping only the rows whose second and third columns fall within specific ranges. The code I have written does not work because it does not take into account that the array is shrinking while I iterate over it.

Example data:

1   358 33  7.1
2   659 85  7.1
3   111 145 7.1
4   558 116 7.1
5   632 40  7.1
6   415 335 7.1
7   207 30  7.1
8   564 47  7.1
9   352 41  7.1
10  700 570 7.1
11  275 499 7.1
12  482 177 7.1
13  737 565 7.1
14  298 43  7.1
15  155 195 7.1
16  598 417 7.1
17  93  313 7.1
18  1150    597 7.1
19  410 451 7.1
20  34  793 7.1
21  997 904 7.1
22  1024    452 7.1
23  740 128 7.1
24  522 86  7.1
25  679 643 7.1
26  973 37  7.1
27  372 42  7.1

For example, I want to keep the rows whose second column is in the range [80, 2000] and whose third column is in the range [130, 2000]. My real array has over 1,000,000 rows.

Here is my code:

def filter_data(data, XRANGE, YRANGE):
    data_f = np.copy(data)
    for l in range(len(data_f)):
        if XRANGE[0] < data_f[l,1] < XRANGE[1] and YRANGE[0] < data_f[l,1] < YRANGE[1]:
            pass
        else:
            data = np.delete(data, l, axis=0)
    return data

How could I do this differently, and much more efficiently?

2 Answers

You can do this with boolean masks, combining them with an element-wise product (for boolean arrays, this is equivalent to a logical AND):

>>> x_range, y_range = [0, 2], [0, 5]

>>> data
array([[ 1,  2,  3],
       [ 1,  1,  1],
       [ 5,  1,  7],
       [ 1, 10,  2]])

First construct two masks based on constraints on data[:, 0] and data[:, 1]:

>>> x_mask = (data[:,0] > x_range[0])*(data[:,0] < x_range[1])
>>> x_mask
array([ True,  True, False,  True])

>>> y_mask = (data[:,1] > y_range[0])*(data[:,1] < y_range[1])
>>> y_mask
array([ True,  True,  True, False])

Essentially, the resulting mask is equivalent to (x > x_min) & (x < x_max) & (y > y_min) & (y < y_max); the parentheses are required because & binds more tightly than the comparison operators:

>>> x_mask*y_mask
array([ True,  True, False, False])

>>> data[x_mask*y_mask]
array([[1, 2, 3],
       [1, 1, 1]])
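
Applied to the question's data, the same idea might look like the sketch below. It assumes the ranges apply to the second and third columns (indices 1 and 2), keeps the strict inequalities used in the question's code, and combines the masks with `&` rather than `*`. The parameter names `x_range` and `y_range` are illustrative, and the seven sample rows are the first seven rows from the question.

```python
import numpy as np


def filter_data(data, x_range, y_range):
    """Keep rows whose 2nd column lies in x_range and 3rd column in y_range."""
    x, y = data[:, 1], data[:, 2]
    # One boolean per row; strict bounds, as in the question's loop
    mask = (x > x_range[0]) & (x < x_range[1]) & (y > y_range[0]) & (y < y_range[1])
    # Boolean indexing returns a new array containing only the selected rows
    return data[mask]


data = np.array([
    [1, 358, 33, 7.1],
    [2, 659, 85, 7.1],
    [3, 111, 145, 7.1],
    [4, 558, 116, 7.1],
    [5, 632, 40, 7.1],
    [6, 415, 335, 7.1],
    [7, 207, 30, 7.1],
])

filtered = filter_data(data, [80, 2000], [130, 2000])
print(filtered[:, 0])  # ids of the rows that were kept
```

The mask is built once and applied with a single indexing operation, so this avoids calling np.delete once per row, where every call copies the array. Use >= and <= instead if the bounds should be inclusive.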

Here is a simple example of what I think you are asking for.

I first create an example array of shape (n,3).

Then I find where the values in the second and third columns exceed a threshold (let's call it 4) and multiply that boolean mask by the original second and third columns, which sets every value that fails the test to 0. Note that this zeroes values rather than removing rows.

Lastly, I concatenate this new array with the first column of the original array, as follows:

a = np.asarray([[2,3,4],
                [3,4,5],
                [4,5,6],
                [10,12,14]])
val = 4
b = a[:,1:3] > val
c = a[:,1:3]*b

np.concatenate((a[:,0:1],c),axis=1)

EDIT: After you updated your example, here is the same approach for an (n,4) array:

a = np.asarray([[2,3,4,5],
                [3,4,5,6],
                [4,5,6,8],
                [10,12,14,9]])

val = 4
b = a[:,1:3] > val
c = a[:,1:3]*b

np.concatenate((a[:,0:1],c,a[:,3:4]),axis=1)
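
If the goal is to drop whole rows rather than zero individual values, the element-wise comparison above can be reduced to one boolean per row. This is a minimal sketch that reuses the (n,4) array and threshold from this answer; `row_mask` and `kept` are illustrative names, not part of the original code.

```python
import numpy as np

a = np.asarray([[2, 3, 4, 5],
                [3, 4, 5, 6],
                [4, 5, 6, 8],
                [10, 12, 14, 9]])
val = 4

# Element-wise test on the second and third columns, shape (n, 2)
b = a[:, 1:3] > val

# A row passes only if both of its tested columns pass
row_mask = b.all(axis=1)

# Boolean indexing keeps the passing rows with all four columns intact
kept = a[row_mask]
print(kept)
```

Here a row is kept only when both columns exceed the threshold; replacing `.all(axis=1)` with `.any(axis=1)` would keep rows where at least one of them does.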
