3

I have a list of numpy arrays, and I would like to check whether a given array is in the list. The behavior I see is very strange, and I'm wondering how to get around it. Here's a simple version of the problem:

import numpy as np
x = np.array([1,1])
a = [x,1]

x in a        # Returns True
(x+1) in a    # Throws ValueError
1 in a        # Throws ValueError

I don't understand what is going on here. Is there a good workaround to this problem?

I'm working with Python 3.7.

Edit: The exact error is:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

My numpy version is 1.18.1.

4 Comments
  • Can you provide the full error and the version for numpy as well? Commented May 28, 2020 at 6:36
  • I get a ValueError: The truth value of an array with more than one element is ambiguous. even with the first check, x in a. Commented May 28, 2020 at 6:39
  • @cicolus Edited. Commented May 28, 2020 at 6:39
  • @norok2 Apologies, I made some modifications to the code in the OP that I thought were unimportant, but turned out to matter. I have rectified that now. However, it makes the problem even stranger to me... Commented May 28, 2020 at 6:45

4 Answers

1

The reason is that the in operator is evaluated more or less as:

def in_sequence(elt, seq):
    for i in seq:
        if elt == i:
            return True
    return False

And 1 == x does not give False. NumPy broadcasts the comparison and returns an array of booleans, and converting that array to a single bool for the if test raises an exception. Element-wise comparison makes sense in most contexts, but here it gives surprising behaviour.
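One detail the sketch above leaves out, and which explains why x in a returns True in the question: per the Python language reference, list membership tests identity before equality, roughly any(e is elt or e == elt for e in seq). The following is a minimal sketch of that behaviour, assuming the x and a from the question; list_contains is a hypothetical name used only for illustration.

```python
import numpy as np


def list_contains(elt, seq):
    """Approximate list membership: identity is checked before equality."""
    for item in seq:
        if item is elt or item == elt:
            return True
    return False


x = np.array([1, 1])
a = [x, 1]

# Same object: `item is elt` short-circuits, so `x == x` is never
# converted to a bool and no error is raised.
found_same_object = list_contains(x, a)

# Different object: the check falls through to `==`, which yields an
# array, and the `if` then has to convert that array to a bool.
try:
    list_contains(x + 1, a)
    raised = False
except ValueError:
    raised = True

print(found_same_object, raised)
```

The identity shortcut is why only the first check in the question succeeds: x is literally the object stored in the list, whereas x + 1 and 1 both have to be compared with x using ==.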

It sounds like a bug, but it is not easy to fix. Processing 1 == np.array([1, 1]) the same way as np.array([1, 1]) == np.array([1, 1]) is a major feature of numpy. And delegating equality comparisons to classes is a major feature of Python. So I cannot even imagine what the correct behaviour should be.

TL;DR: Never mix Python lists and numpy arrays, because they have very different semantics and the mix leads to inconsistent corner cases.

2 Comments

NumPy is breaking the "contract" that the result of == is a bool. It does so in the name of readability, mainly because of the (semi-forced) design choice of implicit broadcasting. It also provides array_equal(), which has the semantics that Python would expect from ==. A clean solution would be to introduce a new operator, e.g. ===, (much as was done for @) with the semantics of the current ==, and let == behave like array_equal(). But that would take 10 years.
And I am confident that there are other places where implicit broadcasting is waiting to show up as a bug, e.g. with the greater/smaller comparisons (although their impact may not be as detrimental).

1

(EDIT: to include a more general and perhaps cleaner approach)

One way around it is to implement a NumPy-safe version of in:

import numpy as np


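def in_np(x, items):
    """NumPy-safe replacement for `x in items`."""
    for item in items:
        # Two arrays match only if their shapes agree and all elements are equal.
        if isinstance(x, np.ndarray) and isinstance(item, np.ndarray) \
                and x.shape == item.shape and np.all(x == item):
            return True
        elif isinstance(x, np.ndarray) or isinstance(item, np.ndarray):
            # Mismatched arrays, or an array against a non-array: not a match.
            pass
        elif x == item:
            # Neither operand is an array, so plain == is safe.
            return True
    return False


x = np.array([1, 1])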
a = [x, 1]

for y in (x, 0, 1, x + 1, np.array([1, 1, 1])):
    print(in_np(y, a))
# True
# False
# True
# False
# False

Or, even better, write a version of in that accepts an arbitrary comparison (defaulting to the ordinary == used by in), and then pass it np.array_equal(), whose semantics match the behavior expected of ==. In code:

import operator

import numpy as np


def in_(x, items, eq=operator.eq):
    for item in items:
        if eq(x, item):
            return True
    return False


x = np.array([1, 1])
a = [x, 1]

for y in (x, 0, 1, x + 1, np.array([1, 1, 1])):
    print(in_(y, a, np.array_equal))
# True
# False
# True
# False
# False
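For a one-off check, the same idea can be written inline with any(). This is a small sketch rather than part of the code above; contains_array is a hypothetical helper name, and it relies only on np.array_equal(), which returns a plain bool and treats a shape mismatch as "not equal".

```python
import numpy as np


def contains_array(y, items):
    """Return True if any item equals y in the np.array_equal() sense."""
    return any(np.array_equal(y, item) for item in items)


x = np.array([1, 1])
a = [x, 1]

same_values = contains_array(x.copy(), a)   # a distinct object with equal values
different_values = contains_array(x + 1, a)
scalar_match = contains_array(1, a)

print(same_values, different_values, scalar_match)
# True False True
```

Unlike the built-in in, this compares by value only, so a copy of x is found even though it is not the same object as the array stored in the list.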

Finally, note that items can be any iterable. For hashing containers like set() or dict, the results are still correct, but the lookup is a linear scan, O(n), rather than the O(1) that the built-in in achieves for them:

print(in_(1, {1, 2, 3}))
# True
print(in_(0, {1, 2, 3}))
# False

in_(1, {1: 2, 3: 4})
# True
in_(0, {1: 2, 3: 4})
# False

0

You can do it like so:

import numpy as np
x = np.array([1,1])
a = np.array([x.tolist(), 1], dtype=object)  # dtype=object is required for ragged input on NumPy >= 1.24

x in a # True
(x+1) in a # False
1 in a # True

0

When evaluating x in [1, x], Python compares x with each element of the list. The comparison x == 1 does not yield a bool but a NumPy array:

>>> x == 1
array([ True,  True])

and interpreting this array as a single bool value triggers the error, because the result is inherently ambiguous:

>>> bool(x == 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
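The error message points at the fix: reduce the boolean array explicitly before using it as a condition. A brief sketch, where y is an illustrative second array that does not appear in the question:

```python
import numpy as np

x = np.array([1, 1])
y = np.array([1, 2])

elementwise = x == y                 # array([ True, False])

# Choose the reduction that matches the intent.
all_equal = bool(elementwise.all())  # every position must match
any_equal = bool(elementwise.any())  # at least one position matches

print(all_equal, any_equal)
# False True
```

For a membership test, all() is normally the reduction that is wanted, combined with a shape check (or np.array_equal(), which does both).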
