6

I have two arrays, say:

a, b = np.array([13., 14., 15., 32., 33.]), np.array([15., 16., 17., 33., 34., 47.])

I need to find the indices of all the elements in a that are not present in b. In the above example the result would be:

[0, 1, 3]

Because a[0], a[1] and a[3] are 13., 14. and 32., which are not present in b. Notice that I don't care to know the actual values of 13., 14. and 32. (I could have used set(a).difference(set(b)), in that case). I am genuinely interested in the indices only.

If possible the answer should be "vectorized", i.e. not using a for loop.

3
  • Is it just a coincidence that both arrays are sorted in this example? (If they are sorted in the real version of your problem, you can exploit that property.) Commented Aug 28, 2013 at 12:08
  • Sorry, I used sorted arrays to make the example easier to read. But I'm still interested to hear what you would do with sorted arrays :) Commented Aug 29, 2013 at 12:47
  • Well, a custom algorithm might get even better complexity by exploiting the fact that they are sorted (not really sure what complexity you would get in the end, but I assume better than whatever you do without that property). Commented Aug 29, 2013 at 13:58
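
One way the sorted-array idea from the comments could be sketched is a binary search with np.searchsorted. This is only an illustration, not something proposed in the thread: the function name missing_indices_sorted is hypothetical, and it assumes that b (but not necessarily a) is sorted in ascending order and that exact equality is the right comparison for the values.

```python
import numpy as np

def missing_indices_sorted(a, b):
    """Indices of elements of a that do not occur in b.

    Assumption: b is sorted in ascending order (a may be in any order).
    """
    a = np.asarray(a)
    b = np.asarray(b)
    if b.size == 0:
        # Nothing can match an empty b, so every index of a qualifies
        return np.arange(a.size)
    # Leftmost position in b where each element of a could be inserted
    pos = np.searchsorted(b, a)
    # Elements larger than everything in b get pos == b.size; clip to stay in bounds
    pos = np.minimum(pos, b.size - 1)
    # An element of a is present in b only if b holds it at that position
    found = b[pos] == a
    return np.flatnonzero(~found)

a = np.array([13., 14., 15., 32., 33.])
b = np.array([15., 16., 17., 33., 34., 47.])
print(missing_indices_sorted(a, b).tolist())
```

Each lookup is a binary search, so the work is roughly len(a) * log(len(b)) comparisons and no sort is needed when b is already ordered.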

3 Answers 3

3

You could use np.in1d:

>>> np.arange(a.shape[0])[~np.in1d(a,b)].tolist()
[0, 1, 3]
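
For reference, the same idea can be written with np.isin and np.flatnonzero. This is a minimal sketch rather than part of the original answer; it assumes a NumPy version that provides np.isin (1.13 or later), which newer releases recommend over np.in1d. Note that the comparison is exact equality, so floating-point values that differ only by rounding error would count as different.

```python
import numpy as np

a = np.array([13., 14., 15., 32., 33.])
b = np.array([15., 16., 17., 33., 34., 47.])

# Boolean mask: True where an element of a also occurs in b
in_b = np.isin(a, b)

# Positions where the mask is False, i.e. elements of a missing from b
missing = np.flatnonzero(~in_b)
print(missing.tolist())
```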

2

It is quite easy: use numpy.intersect1d to compute the elements shared between a and b, then use numpy.in1d to check which elements of a are not among those shared elements, and finally get their positions in the array with numpy.argwhere.

>>> import numpy as np
>>> a, b = np.array([13., 14., 15., 32., 33.]), np.array([15., 16., 17., 33., 34., 47.])
>>> np.argwhere(np.in1d(a, np.intersect1d(a,b)) == False)
array([[0],
       [1],
       [3]])

If you prefer a list, just add .flatten() to convert the column array to a one-dimensional array and then apply .tolist() to get the list:

>>> np.argwhere(np.in1d(a, np.intersect1d(a,b)) == False).flatten().tolist()
[0, 1, 3]


1

Fairly straightforward if you use loops:

def difference_indices(a, b):

    # List to collect the indices of elements of a that are not in b
    indices = []

    # So we know the index of the element of a that we're looking at
    a_index = 0

    for elem_a in a:

        found_in_b = False
        b_index = 0

        # Loop until we find a match. If we reach the end of b without a match, the current 
        # a index should go in the indices list
        while not found_in_b and b_index < len(b):
            if elem_a == b[b_index]: found_in_b = True
            b_index = b_index + 1

        if not found_in_b: indices.append(a_index)
        a_index = a_index + 1

    return indices

This should work with lists containing any one type, as long as both lists hold the same type and that type defines __eq__.
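
If the elements are also hashable (an extra assumption that the loop above does not need), a shorter variant could put b into a set first, so each membership test takes roughly constant time instead of scanning b. The name difference_indices_set is hypothetical, and this is only a sketch of that alternative; it still iterates in Python, so it is not "vectorized" in the NumPy sense.

```python
def difference_indices_set(a, b):
    """Indices of elements of a that are not in b.

    Assumption: the elements are hashable, so they can be stored in a set.
    """
    # Build the lookup set once; membership tests are then roughly O(1)
    b_values = set(b)
    # Keep the index of every element of a that has no match in b
    return [i for i, elem in enumerate(a) if elem not in b_values]

print(difference_indices_set([13., 14., 15., 32., 33.],
                             [15., 16., 17., 33., 34., 47.]))
```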

Doing this without loops would require a knowledge of Python greater than mine! Hope this is useful for you.
