1

I am looking for the equivalent of an SQL 'where' query over a table. I have done a lot of searching and I'm either using the wrong search terms or not understanding the answers. Probably both.

In my case the table is a 2-dimensional NumPy array.

import numpy as np

my_array = np.array([[32, 55,  2],
                     [15,  2, 60],
                     [76, 90,  2],
                     [ 6, 65,  2]])

I wish to end up with a NumPy array of the same shape containing only the rows where, e.g., the second column values are >= 55 AND <= 65.

So my desired numpy array would be...

desired_array = np.array([[32, 55,  2],
                          [ 6, 65,  2]])

Also, does the row order of 'desired_array' match the row order of 'my_array'?

4 Answers

4

Just make a boolean mask over the rows and use it to index the array.

mask = np.logical_and(my_array[:, 1] >= 55, my_array[:, 1] <= 65)
desired_array = my_array[mask]
desired_array
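
For completeness, here is a minimal self-contained sketch of the same idea, written with the `&` operator instead of `np.logical_and` (the two are equivalent for boolean arrays). It also touches on the ordering question: boolean-mask indexing returns the selected rows in their original order. The names `second_col` and `kept_rows` are introduced here for illustration only.

```python
import numpy as np

my_array = np.array([[32, 55,  2],
                     [15,  2, 60],
                     [76, 90,  2],
                     [ 6, 65,  2]])

# 1-D view of the second column (index 1).
second_col = my_array[:, 1]

# Boolean mask with one entry per row: True where 55 <= value <= 65.
# The parentheses are required because & binds tighter than >= and <=.
mask = (second_col >= 55) & (second_col <= 65)

# Indexing with the mask keeps only the rows marked True.
desired_array = my_array[mask]

# Indices of the rows that were kept, in ascending (original) order.
kept_rows = np.flatnonzero(mask)
```

Here `kept_rows` holds the positions of the matching rows in `my_array`, which can be handy if you need to map the filtered rows back to the source table.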

1 Comment

@Gilseung Ahn Wow! That was quick and correct. Thank you so much. I never came across anything close to that in my copious searches.
0

The general NumPy approach to filtering an array is to create a boolean "mask" that marks the desired part of the array, and then use that mask to index into the array.

>>> my_array[((55 <= my_array) & (my_array <= 65))[:, 1]]
array([[32, 55,  2],
       [ 6, 65,  2]])

Breaking it down:

# Comparing an array to a scalar gives you an array of all the results of
# individual element comparisons (this is called "broadcasting").
# So we take two such boolean arrays, resulting from comparing values to the
# two thresholds, and combine them together.
mask = (55 <= my_array) & (my_array <= 65)

# We only want to care about the [1] element in the second array dimension,
# so we take a 1-dimensional slice of that mask.
desired_rows = mask[:, 1]

# Finally we use those values to select the desired rows.
desired_array = my_array[desired_rows]

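(The first two operations could instead be swapped: slice out column 1 first, then compare. I imagine that way is more efficient, since only one column is compared rather than the whole array, but it wouldn't matter for something this small. This way is the one that occurred to me first.)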
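
A rough sketch of that swapped order, assuming the same `my_array` as in the question (the name `column` is just for illustration):

```python
import numpy as np

my_array = np.array([[32, 55,  2],
                     [15,  2, 60],
                     [76, 90,  2],
                     [ 6, 65,  2]])

# Slice first: take the second column as a 1-D array.
column = my_array[:, 1]

# Then compare: only this one column is tested against the thresholds.
desired_rows = (55 <= column) & (column <= 65)

# Select the rows exactly as before.
desired_array = my_array[desired_rows]
```

The result is the same as with the full-array mask; the difference is only in how many elements get compared.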

0

You don't mean the same shape; you probably mean the same number of columns. The shape of my_array is (4, 3) and the shape of your desired array is (2, 3). I would recommend masking, too.
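
A quick way to check this, sketched with the masking approach from the other answers:

```python
import numpy as np

my_array = np.array([[32, 55,  2],
                     [15,  2, 60],
                     [76, 90,  2],
                     [ 6, 65,  2]])

# Keep the rows whose second column lies in [55, 65].
mask = (my_array[:, 1] >= 55) & (my_array[:, 1] <= 65)
desired_array = my_array[mask]

# The row count changes; the column count stays the same.
print(my_array.shape)       # (4, 3)
print(desired_array.shape)  # (2, 3)
```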

-1

You can use the built-in filter function with a lambda that checks each row against the desired condition:

my_array = np.array([[32, 55,  2],
                     [15,  2, 60], 
                     [76, 90,  2], 
                     [ 6, 65,  2]])

# Keep each row whose second element is between 55 and 65 (inclusive).
desired_array = np.array(list(filter(lambda row: 55 <= row[1] <= 65, my_array)))

Upon running this, we get:

>>> desired_array
array([[32, 55,  2],
       [ 6, 65,  2]])

1 Comment

If you are using list comprehensions to operate on a numpy array, you are defeating the purpose of using numpy.
