Remove elements from 2d Numpy array based on a list [duplicate]

Question

Thank you in advance for taking a look at my post.

I have a 2d np.array called actions with shape (2,x) which contains ints

I have another 1d np.array, keys, whose elements are of the same type as the first row of actions, actions[0]. I want to remove from actions every column whose first-row value is in keys. I tried diff = actions[:, not actions[0] == kids_keys] but it returns a 3d array of shape (2,1,x).

How can I get a (2,x) diff array back?

For example:

actions = [[121122, 211122, 221122, ..., 455544, 545544], [0, 0.35, 0.75, ..., 1, -0.25]]
keys = [211122, 221122]
# The operation I am looking for:
actions - keys = [[121122, ..., 455544, 545544], [0, ..., 1, -0.25]]

The error: the dimensions of the diff array become (2,1,80), and I don't know why.

It would be helpful if you posted code for this, so people can see where you are in the process of trying to solve it. Sample input and expected output would be helpful as well.
– asylumax, Commented Jun 2, 2020 at 3:49
The duplicate is transposed, but you can either transpose your output or just operate on the right dimensions.
– Mad Physicist, Commented Jun 2, 2020 at 12:21

javidcf · Accepted Answer · 2020-06-02 16:53:38Z

2

Use np.isin:

mask = np.isin(actions[0], keys, invert=True)
result = actions[:, mask]
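
A self-contained sketch of the approach above. The sample values are made up for illustration and only loosely follow the question's example (the elided entries are not reconstructed).

```python
import numpy as np

# Hypothetical sample data shaped like the question's (2, x) array
actions = np.array([
    [121122, 211122, 221122, 455544, 545544],
    [0.0, 0.35, 0.75, 1.0, -0.25],
])
keys = np.array([211122, 221122])

# True for columns whose first-row value is NOT in keys
mask = np.isin(actions[0], keys, invert=True)
result = actions[:, mask]

print(mask)          # [ True False False  True  True]
print(result.shape)  # (2, 3)
```

Because mask has one boolean per column, indexing with actions[:, mask] keeps the array 2-dimensional.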

edited Jun 2, 2020 at 16:53

answered Jun 2, 2020 at 12:18

javidcf


3 Comments

Alexander Ithakis Over a year ago

It seems to work great! I would like to ask what ~ means and whether there is a faster way to do this operation.

javidcf Over a year ago

@AlexanderIthakis The ~ was there to invert the result of np.isin (so only those not in keys are kept). I changed it because, although inverting is a very fast operation, np.isin accepts an invert parameter that does the same more efficiently as part of the function call itself. As for being faster, if I knew a better way to do it I'd post it :) np.isin is fairly fast, and if all keys are always unique you can pass assume_unique=True. The masking is advanced indexing, which is relatively expensive.
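
A small sketch comparing the variants mentioned in this comment, on made-up data. One caveat worth checking against the NumPy documentation: assume_unique=True is described there as applying to both input arrays, not only to keys, so the example uses a first row without duplicates as well.

```python
import numpy as np

# Made-up values; both arrays are free of duplicates
first_row = np.array([121122, 211122, 221122, 455544, 545544])
keys = np.array([211122, 221122])

# Inverting with ~ after the call ...
mask_tilde = ~np.isin(first_row, keys)
# ... matches asking np.isin to invert internally
mask_invert = np.isin(first_row, keys, invert=True)
# assume_unique=True is only safe here because neither array has duplicates
mask_unique = np.isin(first_row, keys, invert=True, assume_unique=True)

print(np.array_equal(mask_tilde, mask_invert))   # True
print(np.array_equal(mask_invert, mask_unique))  # True
```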

Alexander Ithakis Over a year ago

Hey, I appreciate the detailed answer. Yes, my keys are unique, so I will definitely change it. Thanks again.

William Gurecky · Answer · 2020-06-02 04:18:18Z

0

The following will filter out columns of actions that have a first row entry in the set keys:

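import numpy as np

x = 10
# Random 2 x 10 integer array with values in [0, 5)
actions = np.random.randint(5, size=(2, x))
print(actions)
keys = np.array([1, 2, 3])
print(keys)
# One boolean row per key, OR-ed together: True where actions[0] matches any key
in_keys = np.any([actions[0, :] == key for key in keys], axis=0)
# Keep only the columns whose first-row entry is not in keys
filtered_actions = actions[:, ~in_keys]
print(filtered_actions)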
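
The per-key comparison above can also be written with broadcasting instead of a Python loop. This is a minimal sketch on fixed, made-up inputs (so the output is reproducible), not a change to the answer's method.

```python
import numpy as np

# Hypothetical small inputs, fixed so the result is reproducible
actions = np.array([[0, 1, 2, 3, 4, 1],
                    [10, 11, 12, 13, 14, 15]])
keys = np.array([1, 2, 3])

# Compare every first-row entry with every key: shape (6, 3)
matches = actions[0][:, None] == keys
# A column is dropped if its first-row entry equals any key
in_keys = matches.any(axis=1)
filtered_actions = actions[:, ~in_keys]

print(filtered_actions)
# [[ 0  4]
#  [10 14]]
```

The intermediate matches array has one entry per (column, key) pair, so for large inputs np.isin is likely the more memory-friendly choice.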

answered Jun 2, 2020 at 4:18

William Gurecky


1 Comment

Alexander Ithakis Over a year ago

Thank you for your answer! For some reason, when I try your code in a terminal it works as intended, but when I use it in my code I get the same problem I had with my approach: it returns a (2, 1, x) array instead of (2, x). Can you please help me?
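
One plausible cause of the extra axis, offered as a guess since the surrounding code is not shown: if the expression inside the brackets collapses to a single True/False instead of a boolean array of length x (for instance because the shapes being compared do not match), NumPy treats that scalar boolean as an index that inserts a new axis of length 1. The sketch below uses a made-up 2 x 3 array to show the effect, plus a shape check that would catch it.

```python
import numpy as np

actions = np.arange(6).reshape(2, 3)  # made-up (2, x) array with x = 3

# A scalar boolean index adds an axis instead of selecting columns
print(actions[:, True].shape)   # (2, 1, 3)

# A proper mask has one boolean per column, so the result stays 2-D
mask = np.isin(actions[0], [1], invert=True)
assert mask.shape == (actions.shape[1],)
print(actions[:, mask].shape)   # (2, 2)
```

Printing the shape of the mask right before indexing is a quick way to tell whether this is what happens in the real code.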
