Checking whether NumPy array is in Python list

Question

I have a list of NumPy arrays, and I would like to check whether a given array is in the list. There is some very strange behavior with this, and I'm wondering how to get around it. Here's a simple version of the problem:

import numpy as np
x = np.array([1,1])
a = [x,1]

x in a        # Returns True
(x+1) in a    # Throws ValueError
1 in a        # Throws ValueError

I don't understand what is going on here. Is there a good workaround to this problem?

I'm working with Python 3.7.

Edit: The exact error is:

ValueError: The truth value of an array with more than one element is ambiguous.  Use a.any() or a.all()

My numpy version is 1.18.1.

Can you provide the full error and the version for numpy as well?
– cicolus, Commented May 28, 2020 at 6:36
I get a ValueError: The truth value of an array with more than one element is ambiguous. even with the first check x in a
– norok2, Commented May 28, 2020 at 6:39
@norok2 Apologies, I made some modifications to the code in the OP that I thought were unimportant, but turned out to matter. I have rectified that now. However, it makes the problem even stranger to me...
– Yly, Commented May 28, 2020 at 6:45

Serge Ballesta · Accepted Answer · 2020-05-28 07:03:24Z

1

The reason is that in is more or less interpreted as

def in_sequence(elt, seq):
    for i in seq:
        # Identity is checked first, so the very same object always matches.
        if elt is i or elt == i:
            return True
    return False

And 1 == x gives neither True nor False: NumPy compares element-wise and returns an array of booleans, and converting that array to a single bool for the if test raises an exception. Element-wise comparison makes sense in most contexts, but here it gives surprising behaviour.
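As a small illustration of why x in a succeeds in the question while the other checks fail: list membership tests identity before equality, so the very same array object matches without == ever being evaluated. The sketch below assumes the x and a from the question; x.copy() is used here only to get an equal array that is a different object.

```python
import numpy as np

x = np.array([1, 1])
a = [x, 1]

# Same object: the identity check succeeds before == is evaluated.
same_object = x in a

# Equal values but a different object: == runs, returns an array,
# and converting that array to a single bool raises ValueError.
try:
    x.copy() in a
    copy_raised = False
except ValueError:
    copy_raised = True

print(same_object, copy_raised)
# True True
```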

It sounds like a bug, but it is not easy to fix. Treating 1 == np.array([1, 1]) the same as np.array([1, 1]) == np.array([1, 1]) is a major feature of NumPy, and delegating equality comparisons to classes is a major feature of Python. So I cannot even imagine what the correct behaviour should be.

TL;DR: Never mix Python lists and NumPy arrays, because they have very different semantics and mixing them leads to inconsistent corner cases.

answered May 28, 2020 at 7:03

Serge Ballesta


2 Comments

norok2 Over a year ago

NumPy is breaking the "contract" that the result of == is a bool. It does so in the name of readability, mainly because of the (semi-forced) design choice of implicit broadcasting. It also provides array_equal(), which has the semantics that Python would expect from ==. A clean solution would be to introduce a new operator, e.g. ===, (much as was done for @) that has the semantics of the current ==, and to let == behave like array_equal(). But that would take 10 years.

norok2 Over a year ago

And I am confident that there are other places where the implicit broadcasting is lurking to show up as a bug, e.g. with the greater/smaller behavior (although their impact may not be as detrimental).

norok2 · Accepted Answer · 2020-05-28 21:11:37Z

(EDIT: to include a more general and perhaps cleaner approach)

One way around it is to implement a NumPy-safe version of in:

import numpy as np


def in_np(x, items):
    for item in items:
        if isinstance(x, np.ndarray) and isinstance(item, np.ndarray) \
                and x.shape == item.shape and np.all(x == item):
            return True
        elif isinstance(x, np.ndarray) or isinstance(item, np.ndarray):
            pass
        elif x == item:
            return True
    return False

x = np.array([1, 1])
a = [x, 1]

for y in (x, 0, 1, x + 1, np.array([1, 1, 1])):
    print(in_np(y, a))
# True
# False
# True
# False
# False

Or, even better, write a version of in that takes an arbitrary comparison function (defaulting to the usual == behavior), and then pass it np.array_equal(), whose semantics match what is expected of ==. In code:

import operator


def in_(x, items, eq=operator.eq):
    for item in items:
        if eq(x, item):
            return True
    return False

x = np.array([1, 1])
a = [x, 1]

for y in (x, 0, 1, x + 1, np.array([1, 1, 1])):
    print(in_(y, a, np.array_equal))
# True
# False
# True
# False
# False

Finally, note that items can be any iterable, but the operation is a linear scan, so it will not be O(1) even for hash-based containers like set() (for a dict, the keys are what get scanned), although it will still give correct results:

print(in_(1, {1, 2, 3}))
# True
print(in_(0, {1, 2, 3}))
# False

in_(1, {1: 2, 3: 4})
# True
in_(0, {1: 2, 3: 4})
# False
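If constant-time lookups matter, one possible alternative (a different technique from the in_ function above, and not part of the original answer) is to store a hashable key per array in a set. The helper name array_key is hypothetical, and the sketch assumes that two arrays count as equal only when dtype, shape, and contents all match.

```python
import numpy as np


def array_key(arr):
    """Hypothetical helper: hashable key built from dtype, shape, and raw bytes."""
    arr = np.asarray(arr)
    return (arr.dtype.str, arr.shape, arr.tobytes())


arrays = [np.array([1, 1]), np.array([2, 3, 4])]
seen = {array_key(arr) for arr in arrays}

print(array_key(np.array([1, 1])) in seen)
# True
print(array_key(np.array([2, 2])) in seen)
# False
print(array_key(np.array([1, 1, 1])) in seen)
# False
```

Unlike np.array_equal(), this key is dtype-sensitive, so an integer array and a float array with the same values would not match.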

William Clavier · Accepted Answer · 2020-05-28 06:57:06Z

0

You can do it like so:

import numpy as np
x = np.array([1,1])
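# dtype=object keeps the ragged input valid; NumPy 1.24+ raises ValueError without it.
a = np.array([x.tolist(), 1], dtype=object)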

x in a # True
(x+1) in a # False
1 in a # True
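One caveat worth keeping in mind with this approach: as far as I know, in on an ndarray behaves like (a == x).any(), so a match on any single element counts. The sketch below uses made-up names (rows, target) and a regular 2-D array to show how that differs from a whole-row match.

```python
import numpy as np

rows = np.array([[1, 2], [3, 4]])
target = np.array([1, 5])  # not a row of `rows`

# Element-wise: rows == target is True at position [0, 0], so `in` reports True.
elementwise = target in rows

# Whole-row match: every element of some row must equal target.
row_match = bool((rows == target).all(axis=1).any())

print(elementwise, row_match)
# True False
```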

answered May 28, 2020 at 6:57

William Clavier


cicolus · Accepted Answer · 2020-05-28 06:59:02Z

0

When evaluating x in [1, x], Python compares x with each element of the list, and the comparison x == 1 produces a NumPy array:

>>> x == 1
array([ True,  True])

and interpreting this array as a bool value will trigger the error due to inherent ambiguity:

>>> bool(x == 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
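The error message itself names the two explicit reductions that remove the ambiguity. A minimal sketch, with y added here purely for contrast:

```python
import numpy as np

x = np.array([1, 1])
y = np.array([1, 2])  # extra array, only for contrast

# .all() asks "does every element match?", .any() asks "does at least one match?"
x_all, x_any = bool((x == 1).all()), bool((x == 1).any())
y_all, y_any = bool((y == 1).all()), bool((y == 1).any())

print(x_all, x_any)
# True True
print(y_all, y_any)
# False True
```

For a membership test, .all() combined with a shape check (or np.array_equal()) is usually the meaning one wants.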

answered May 28, 2020 at 6:59

cicolus
