Is there a way to find out if a list contains any duplicates? For example:
list1 = [1,2,3,4,5]
list2 = [1,1,2,3,4,5]
list1.*method* = False # no duplicates
list2.*method* = True # contains duplicates
If you convert the list to a set temporarily, that will eliminate the duplicates in the set. You can then compare the lengths of the list and set.
In code, it would look like this:
list1 = [...]
tmpSet = set(list1)
haveDuplicates = len(list1) != len(tmpSet)
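A quick demonstration with concrete data (the variable name is illustrative):

nums = [1, 1, 2, 3, 4, 5]
print(len(nums) != len(set(nums)))  # True, so nums contains duplicates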
Convert the list to a set to remove duplicates. Compare the lengths of the original list and the set to see if any duplicates existed.
>>> list1 = [1,2,3,4,5]
>>> list2 = [1,1,2,3,4,5]
>>> len(list1) == len(set(list1))
True # no duplicates
>>> len(list2) == len(set(list2))
False # duplicates
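If the list is large and duplicates tend to appear early, a variant that short-circuits at the first repeat avoids building the full set. This is a minimal sketch (the function name is illustrative), still limited to hashable items:

def has_duplicates(seq):
    seen = set()
    for item in seq:
        if item in seen:
            return True  # stop at the first repeated element
        seen.add(item)
    return False

print(has_duplicates([1, 2, 3, 4, 5]))     # False
print(has_duplicates([1, 1, 2, 3, 4, 5]))  # True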
The set() approach only works for hashable objects, so for completeness, here is a version that uses plain iteration and therefore works for unhashable items too:
import itertools

def has_duplicates(iterable):
    """
    >>> has_duplicates([1,2,3])
    False
    >>> has_duplicates([1, 2, 1])
    True
    >>> has_duplicates([[1,1], [3,2], [4,3]])
    False
    >>> has_duplicates([[1,1], [3,2], [4,3], [4,3]])
    True
    """
    # Compare every pair of elements; no hashing needed, but this is O(n^2).
    return any(x == y for x, y in itertools.combinations(iterable, 2))
If you want the set-based approach to work on a list of lists, you can hash each inner list yourself, e.g. listHash = lambda x: hash(tuple(x)). Since this hash is just a one-time thing, you don't have to worry about objects mutating on you. Creating such a function doesn't make list objects any more hashable, though, because list.__hash__ is still None. As for efficiency, you can easily tweak this to take constant extra memory; hashing is just a CPU/memory tradeoff.
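A minimal sketch of that idea (the function name is illustrative): convert each inner list to a tuple so it becomes hashable, then reuse the set-length trick. Storing the tuples themselves, rather than their hash values, avoids false positives from hash collisions at the cost of some extra memory.

def has_duplicate_lists(list_of_lists):
    as_tuples = [tuple(inner) for inner in list_of_lists]
    return len(as_tuples) != len(set(as_tuples))

print(has_duplicate_lists([[1,1], [3,2], [4,3]]))          # False
print(has_duplicate_lists([[1,1], [3,2], [4,3], [4,3]]))   # True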