4

This question is related to (but not the same as) "numpy.unique generates a list unique in what regard?"

The setup:

import numpy as np
from functools import total_ordering

@total_ordering
class UniqueObject(object):
    def __init__(self, a):
        self.a = a
    def __eq__(self, other):
        return self.a == other.a
    def __lt__(self, other):
        return self.a < other.a
    def __hash__(self):
        return hash(self.a)
    def __str__(self):
        return "UniqueObject({})".format(self.a)
    def __repr__(self):
        return self.__str__()

Expected behaviour of np.unique:

>>> np.unique([1, 1, 2, 2])
array([1, 2])
>>> np.unique(np.array([1, 1, 2, 2]))
array([1, 2])
>>> np.unique(map(UniqueObject, [1, 1, 2, 2]))
array([UniqueObject(1), UniqueObject(2)], dtype=object)

Which is no problem, it works. But this doesn't work as expected:

>>> np.unique(np.array(map(UniqueObject, [1, 1, 2, 2])))
array([UniqueObject(1), UniqueObject(1), UniqueObject(2), UniqueObject(2)], dtype=object)

How come np.array with dtype=object is handled differently than a python list with objects?

That is:

objs = map(UniqueObject, [1, 1, 2, 2])
np.unique(objs) != np.unique(np.array(objs)) #?

I'm running numpy 1.8.0.dev-74b08b3 and Python 2.7.3

1 Answer 1

3

Following through the source of np.unique, it seems that the branch which is actually taken is

else:
    ar.sort()
    flag = np.concatenate(([True], ar[1:] != ar[:-1]))
    return ar[flag]

which simply sorts the terms and then takes the ones which aren't equal to the previous one. But shouldn't that work?.. oops. This is on me. Your original code defined __ne__, and I accidentally removed it when removing the comparisons being total_ordering-ed.

>>> UniqueObject(1) == UniqueObject(1)
True
>>> UniqueObject(1) != UniqueObject(1)
True

Putting __ne__ back in:

>>> UniqueObject(1) != UniqueObject(1)
False
>>> np.array(map(UniqueObject, [1,1,2,2]))
array([UniqueObject(1), UniqueObject(1), UniqueObject(2), UniqueObject(2)], dtype=object)
>>> np.unique(np.array(map(UniqueObject, [1,1,2,2])))
array([UniqueObject(1), UniqueObject(2)], dtype=object)
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.