Determine arguments where two numpy arrays intersect in Python

Question

I have two arrays, say:

a, b = np.array([13., 14., 15., 32., 33.]), np.array([15., 16., 17., 33., 34., 47.])

I need to find the indices of all the elements in a that are not present in b. In the above example the result would be:

[0, 1, 3]

Because a[0], a[1] and a[3] are 13., 14. and 32., which are not present in b. Notice that I don't care to know the actual values of 13., 14. and 32. (I could have used set(a).difference(set(b)), in that case). I am genuinely interested in the indices only.

If possible the answer should be "vectorized", i.e. not using a for loop.

Is it just a coincidence that both arrays are sorted in this example? (If they are sorted in the real version of your problem, you can exploit that property.)
– usethedeathstar, Commented Aug 28, 2013 at 12:08
Sorry, I used sorted arrays to make the example easier to read. But I'm still interested to hear what you would do with sorted arrays :)
– astabada, Commented Aug 29, 2013 at 12:47
Well, a custom algorithm might achieve even better complexity by exploiting the fact that they are sorted. (I'm not sure what complexity you would end up with, but I assume it would be better than what you get without that property.)
– usethedeathstar, Commented Aug 29, 2013 at 13:58
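
The sorted-array idea raised in these comments could be sketched with np.searchsorted, which runs a binary search for each element of a. This is only an illustration, not something the commenters posted: it assumes b is sorted in ascending order and non-empty, and the function name indices_not_in_sorted and its variable names are made up for this example.

```python
import numpy as np

def indices_not_in_sorted(a, b_sorted):
    """Indices of elements of a that do not occur in b_sorted.

    Assumption: b_sorted is sorted ascending and has at least one element.
    """
    # For each element of a, the position where it would be inserted into b_sorted
    pos = np.searchsorted(b_sorted, a)
    # Elements larger than everything in b_sorted get pos == len(b_sorted); clip them
    pos = np.minimum(pos, len(b_sorted) - 1)
    # An element is present only if the value at its insertion position equals it
    present = b_sorted[pos] == a
    return np.flatnonzero(~present)

a = np.array([13., 14., 15., 32., 33.])
b = np.array([15., 16., 17., 33., 34., 47.])
print(indices_not_in_sorted(a, b).tolist())  # [0, 1, 3]
```

Note that a itself does not need to be sorted for this to work; only b does.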

crs17 · Accepted Answer · 2013-08-28 11:45:38Z

3

You could use np.in1d:

>>> np.arange(a.shape[0])[~np.in1d(a,b)].tolist()
[0, 1, 3]
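
On newer NumPy versions, np.in1d is deprecated in favour of np.isin, so the same idea might be written as below. This is a sketch rather than part of the original answer; it assumes a NumPy version that provides np.isin (1.13 or later), and the names mask and idx are chosen only for illustration.

```python
import numpy as np

a = np.array([13., 14., 15., 32., 33.])
b = np.array([15., 16., 17., 33., 34., 47.])

# Boolean mask that is True where an element of a does NOT occur in b
mask = np.isin(a, b, invert=True)

# Positions of the True entries, as a flat integer array
idx = np.flatnonzero(mask)
print(idx.tolist())  # [0, 1, 3]
```

np.flatnonzero(mask) returns the same indices as np.arange(a.shape[0])[mask] without building the intermediate range.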

edited Aug 28, 2013 at 11:45

answered Aug 28, 2013 at 11:10


jabaldonedo · Accepted Answer · 2013-08-28 11:46:17Z

2

It is quite easy: use numpy.intersect1d to compute the elements shared between a and b, then use numpy.in1d to check which elements of a are not among those shared elements, and finally get their positions in the array using numpy.argwhere.

>>> import numpy as np
>>> a, b = np.array([13., 14., 15., 32., 33.]), np.array([15., 16., 17., 33., 34., 47.])
>>> np.argwhere(np.in1d(a, np.intersect1d(a,b)) == False)
array([[0],
       [1],
       [3]])

If you prefer a list, just call .flatten() to convert the resulting 2-D column array to a 1-D array, and then call .tolist() to get the list:

>>> np.argwhere(np.in1d(a, np.intersect1d(a,b)) == False).flatten().tolist()
[0, 1, 3]

edited Aug 28, 2013 at 11:46

answered Aug 28, 2013 at 11:39


AlexJ136 · Accepted Answer · 2013-08-28 11:27:06Z

Fairly straightforward if you use loops:

def difference_indices(a, b):

    # List to collect the indices of the elements of a that are not in b
    indices = []

    # So we know the index of the element of a that we're looking at
    a_index = 0

    for elem_a in a:

        found_in_b = False
        b_index = 0

        # Loop until we find a match. If we reach the end of b without a match, the current 
        # a index should go in the indices list
        while not found_in_b and b_index < len(b):
            if elem_a == b[b_index]: found_in_b = True
            b_index = b_index + 1

        if not found_in_b: indices.append(a_index)
        a_index = a_index + 1

    return indices

This should work with lists of any single type, as long as both lists hold the same type and the __eq__ method is defined for that type.

Doing this without loops would require a knowledge of Python greater than mine! Hope this is useful for you.
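
A more compact pure-Python variant might replace the inner while loop with a set lookup. This is a sketch under an extra assumption that the loop version does not need: the elements must be hashable, not just comparable with __eq__. The name difference_indices_set is hypothetical, and the code still iterates in Python, so it is not "vectorized" in the NumPy sense.

```python
def difference_indices_set(a, b):
    """Indices of elements of a that are not in b (elements assumed hashable)."""
    # Building a set makes each membership test roughly constant time
    b_values = set(b)
    # enumerate yields (index, element) pairs, so no manual counter is needed
    return [i for i, elem in enumerate(a) if elem not in b_values]

a = [13., 14., 15., 32., 33.]
b = [15., 16., 17., 33., 34., 47.]
print(difference_indices_set(a, b))  # [0, 1, 3]
```

The nested loops compare every element of a against every element of b in the worst case, whereas the set-based version looks each element of a up once.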
