2

Is there a way to find out whether a list contains duplicates? For example:

list1 = [1,2,3,4,5]
list2 = [1,1,2,3,4,5]

list1.*method* = False # no duplicates
list2.*method* = True # contains duplicates
  • Is this assuming the lists are always sorted? Commented Jun 28, 2012 at 17:25
  • Possible duplicate: stackoverflow.com/questions/1920145/… Commented Jun 28, 2012 at 17:27
  • @tyjkenn: Checking for existence of duplicates is simpler than finding the actual duplicates (which is what the other question is about). Commented Jun 28, 2012 at 17:30

4 Answers

14

If you convert the list to a set temporarily, that will eliminate the duplicates in the set. You can then compare the lengths of the list and set.

In code, it would look like this:

list1 = [...]
tmpSet = set(list1)
haveDuplicates = len(list1) != len(tmpSet)
2 Comments

+1 for including some actual text to explain what you are doing as opposed to just plopping down code.
@jdi: I actually tried to just plop down some code originally but it came under the 30 characters minimum.
2

Convert the list to a set to remove duplicates. Compare the lengths of the original list and the set to see if any duplicates existed.

>>> list1 = [1,2,3,4,5]
>>> list2 = [1,1,2,3,4,5]
>>> len(list1) == len(set(list1))
True # no duplicates
>>> len(list2) == len(set(list2))
False # duplicates
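
A variant of the same idea, sketched under the assumption that every element is hashable: keep a set of the values seen so far and stop at the first repeat, which avoids building the full set when a duplicate shows up early. The name `has_duplicates_early_exit` is illustrative and does not come from the answer above.

```python
def has_duplicates_early_exit(items):
    """Return True as soon as a repeated (hashable) element is found."""
    seen = set()
    for item in items:
        if item in seen:
            # Second occurrence of this value: no need to look further.
            return True
        seen.add(item)
    return False


list1 = [1, 2, 3, 4, 5]
list2 = [1, 1, 2, 3, 4, 5]
print(has_duplicates_early_exit(list1))  # False
print(has_duplicates_early_exit(list2))  # True
```

For short lists the length comparison is probably just as good; the early exit mainly helps with long inputs or generators.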

2

Check whether the original list is longer than the set of its unique elements. If so, there must have been duplicates.

list1 = [1,2,3,4,5]
list2 = [1,1,2,3,4,5]

if len(list1) != len(set(list1)):
    # the set is shorter than the list, so at least one element repeats
    print("list1 contains duplicates")

0

The set() approach only works for hashable objects, so for completeness, you could do it with plain iteration, comparing every pair of elements:

import itertools

def has_duplicates(iterable):
    """
    >>> has_duplicates([1,2,3])
    False
    >>> has_duplicates([1, 2, 1])
    True
    >>> has_duplicates([[1,1], [3,2], [4,3]])
    False
    >>> has_duplicates([[1,1], [3,2], [4,3], [4,3]])
    True
    """
    return any(x == y for x, y in itertools.combinations(iterable, 2))

4 Comments

Ouch. This one hurts for complexity. Better to write hash functions for your unhashable objects.
@JoelCornett Mind doing it for list ?
listHash = lambda x: hash(tuple(x)). Note that since this hash is just a one-time thing, you don't have to worry about objects mutating on you.
Here's a simpler one: lambda x: 1. Creating such a function doesn't make list objects any more hashable, 'cause list.__hash__ is still None. As for efficiency, you can easily tweak this to take constant extra memory. Hashing is just a CPU/memory tradeoff.
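
The hashing suggestion in the comments above could be sketched roughly as follows, assuming every element is a flat list of hashable values: each element is converted to a tuple before it goes into a set. The function name `has_duplicates_by_key` and its `key` parameter are hypothetical, introduced only for this illustration.

```python
def has_duplicates_by_key(iterable, key=tuple):
    """Detect duplicates among unhashable items via a hashable stand-in.

    Assumption: key(item) is hashable and two items compare equal
    exactly when their keys do (true for flat lists with key=tuple).
    """
    seen = set()
    for item in iterable:
        k = key(item)  # e.g. [4, 3] -> (4, 3), which can be stored in a set
        if k in seen:
            return True
        seen.add(k)
    return False


print(has_duplicates_by_key([[1, 1], [3, 2], [4, 3]]))          # False
print(has_duplicates_by_key([[1, 1], [3, 2], [4, 3], [4, 3]]))  # True
```

This does not make list objects themselves hashable, as the last comment points out, and nested lists would need a different `key`; it only trades some extra memory for avoiding the pairwise comparison.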
