How to make lists contain only distinct elements in Python? [duplicate]

Question

I have a list in Python. How can I make its values unique?

Please fix the title of your question. You're not talking about making lists distinct. You're talking about making list items distinct.
– S.Lott, Commented Dec 16, 2010 at 11:31
Why do you need a list in the first place? Maybe set() or dict() is enough.
– Paweł Prażak, Commented Dec 16, 2010 at 14:22
Possible duplicate of Removing duplicates in lists or stackoverflow.com/questions/480214/…
– Ciro Santilli OurBigBook.com, Commented Dec 18, 2017 at 14:25

Mark Byers · Accepted Answer · 2010-12-16 10:29:27Z

404

The simplest is to convert to a set then back to a list:

my_list = list(set(my_list))

One disadvantage with this is that it won't preserve the order. You may also want to consider if a set would be a better data structure to use in the first place, instead of a list.
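A minimal sketch of that trade-off, using a made-up sample list: the round trip through set removes the duplicates but gives no guarantee about the resulting order, so sort explicitly if a deterministic order is needed.

```python
# Hypothetical sample data, not from the original answer.
my_list = [3, 1, 3, 2, 1]

unique = list(set(my_list))

# All distinct values are present, but their order is unspecified.
assert sorted(unique) == [1, 2, 3]

# If any deterministic order will do, sort the result explicitly.
unique_sorted = sorted(set(my_list))
print(unique_sorted)  # [1, 2, 3]
```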

answered Dec 16, 2010 at 10:29

Mark Byers


5 Comments

Ant Over a year ago

Am I wrong, or will the order be preserved in Python 3, because sets are now sorted?

Mark Over a year ago

@Ant Dictionary key order is preserved from Python 3.6, but it says "the order-preserving aspect of this new implementation is considered an implementation detail and should not be relied upon". Since they're both based on hashes, I'd think set would be the same, but it's not mentioned, so apparently not: docs.python.org/3.6/whatsnew/3.6.html

Sky Over a year ago

Preserve order and functional way:

In [23]: from functools import reduce
In [24]: reduce(lambda acc, elem: acc + [elem] if not elem in acc else acc, [2,1,2,3,3,3,4,5], [])
Out[24]: [2, 1, 3, 4, 5]

Inaam Ilahi Over a year ago

The order of the list gets lost this way

Aiden Cullo Over a year ago

Worth mentioning that this doesn't work if the list contains a list, since lists are unhashable.

Paweł Prażak · Accepted Answer · 2011-04-10 06:44:20Z

34

Modified versions of http://www.peterbe.com/plog/uniqifiers-benchmark

To preserve the order:

def f(seq): # Order preserving
  ''' Modified version of Dave Kirby solution '''
  seen = set()
  return [x for x in seq if x not in seen and not seen.add(x)]

OK, now how does it work? The condition x not in seen and not seen.add(x) is a little bit tricky:

In [1]: 0 not in [1,2,3] and not print('add')
add
Out[1]: True

Why does it return True? print (and set.add) returns nothing:

In [3]: type(seen.add(10))
Out[3]: <type 'NoneType'>

and not None evaluates to True. But:

In [2]: 1 not in [1,2,3] and not print('add')
Out[2]: False

Why does it print 'add' in [1] but not in [2]? In [2] the expression is effectively False and print('add'). The and operator short-circuits: it is true only if both operands are true, so once the first operand is False it already knows the answer and never evaluates the second one.
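A small sketch of that short-circuit behaviour, using a hypothetical helper named record in place of print so that the calls can be counted:

```python
calls = []

def record(label):
    """Hypothetical stand-in for print/set.add: logs the call and returns None."""
    calls.append(label)
    return None

# Left operand is True, so `and` has to evaluate the right operand.
first = (0 not in [1, 2, 3]) and not record("first")

# Left operand is False, so `and` short-circuits and record() never runs.
second = (1 not in [1, 2, 3]) and not record("second")

print(first, second, calls)  # True False ['first']
```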

More generic version, more readable, generator based, adds the ability to transform values with a function:

def f(seq, idfun=None): # Order preserving
  return list(_f(seq, idfun))

def _f(seq, idfun=None):  
  ''' Originally proposed by Andrew Dalke '''
  seen = set()
  if idfun is None:
    for x in seq:
      if x not in seen:
        seen.add(x)
        yield x
  else:
    for x in seq:
      x = idfun(x)
      if x not in seen:
        seen.add(x)
        yield x
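A usage sketch with made-up input strings. The generator is restated here in condensed form under a hypothetical name, iter_unique, so that the snippet runs on its own. Note that when idfun is given, the transformed values are what get yielded, not the originals.

```python
def iter_unique(seq, idfun=None):
    """Condensed restatement of the _f generator above (hypothetical name)."""
    seen = set()
    for x in seq:
        if idfun is not None:
            x = idfun(x)  # the transformed value is compared and yielded
        if x not in seen:
            seen.add(x)
            yield x

words = ["Apple", "apple", "Pear", "APPLE", "pear"]

plain = list(iter_unique(words))
folded = list(iter_unique(words, idfun=str.lower))

print(plain)   # ['Apple', 'apple', 'Pear', 'APPLE', 'pear']
print(folded)  # ['apple', 'pear']
```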

Without order (it's faster):

def f(seq): # Not order preserving
  return list(set(seq))

edited Apr 10, 2011 at 6:44

answered Dec 16, 2010 at 17:08

Paweł Prażak


1 Comment

Paweł Prażak Over a year ago

sort of inner helper function (there was a bug in the code, should be _f instead of _f10 on line 2, thanks for spotting)

brillout · Accepted Answer · 2014-07-21 12:44:05Z

31

A one-liner that preserves order:

list(OrderedDict.fromkeys([2,1,1,3]))

although you'll need

from collections import OrderedDict
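Put together as a runnable snippet, with the import first. As an aside that the answer itself does not state: since Python 3.7 the language guarantees insertion order for plain dicts, so dict.fromkeys gives the same result there.

```python
from collections import OrderedDict

items = [2, 1, 1, 3]

# Keys keep the order in which they were first inserted.
unique = list(OrderedDict.fromkeys(items))
print(unique)  # [2, 1, 3]

# Python 3.7+ only: plain dicts are insertion-ordered as well.
unique_plain = list(dict.fromkeys(items))
print(unique_plain)  # [2, 1, 3]
```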

answered Jul 21, 2014 at 12:44

brillout


3 Comments

Danny Staple Over a year ago

An alternative form is: OrderedDict.fromkeys(my_list).keys()

Mark Over a year ago

@DannyStaple: that works in python 2, but in python 3 it returns a view of the dictionary keys, which might be okay for some purposes, but doesn't support indexing for example.

Danny Staple Over a year ago

The initial one-liner will work. The alternative form returns an odict_keys type, which is less useful for this, but it can still be converted to a list.

disp_name · Accepted Answer · 2014-08-11 05:28:47Z

18

Let me explain with an example.

Suppose you have a Python list:

>>> randomList = ["a","f", "b", "c", "d", "a", "c", "e", "d", "f", "e"]

and you want to remove duplicates from it.

>>> uniqueList = []

>>> for letter in randomList:
...     if letter not in uniqueList:
...         uniqueList.append(letter)

>>> uniqueList
['a', 'f', 'b', 'c', 'd', 'e']

This is how you can remove duplicates from the list.

answered Aug 11, 2014 at 5:28

disp_name


2 Comments

Claude Over a year ago

+1 because it's the only one that works for types that are unhashable, but do have an eq function (if your types are hashable, use one of the other solutions). Note that it will be slow for very big lists.

Yingbo Miao Over a year ago

Except in the special case Claude described, this one has the worst performance: O(n^2).
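To illustrate the trade-off raised in these comments, here is a sketch with made-up data: nested lists are unhashable, so the set-based approaches raise TypeError, while the membership-test loop still works because it only needs equality comparison.

```python
# Hypothetical sample data: a list whose items are themselves lists.
rows = [[1, 2], [3, 4], [1, 2], [5, 6], [3, 4]]

try:
    set(rows)
    set_worked = True
except TypeError:
    # Lists are unhashable, so they cannot be members of a set.
    set_worked = False

# The quadratic loop relies only on ==, so it handles unhashable items.
unique_rows = []
for row in rows:
    if row not in unique_rows:
        unique_rows.append(row)

print(set_worked)   # False
print(unique_rows)  # [[1, 2], [3, 4], [5, 6]]
```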

khachik · Accepted Answer · 2010-12-16 10:39:29Z

15

To preserve the order:

l = [1, 1, 2, 2, 3]
result = list()
# Python 2 only as written: in Python 3 map() is lazy, so wrap the call in list(...) to run it
map(lambda x: not x in result and result.append(x), l)
result
# [1, 2, 3]

edited Dec 16, 2010 at 10:39

answered Dec 16, 2010 at 10:32

khachik


3 Comments

Broken_Window Over a year ago

In python 3.4 returns an empty list!!!

user2389519 Over a year ago

map just creates a map object (a lazy iterator) and does not execute it. list(map(....)) forces the execution.

eg04lt3r Over a year ago

Performance will be suboptimal, because the result list is traversed for each x.
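The behaviour described in these comments can be reproduced with a short Python 3 sketch: the map object does nothing until it is consumed.

```python
l = [1, 1, 2, 2, 3]

result = []
lazy = map(lambda x: x not in result and result.append(x), l)

# Nothing has run yet: map() in Python 3 returns a lazy iterator.
untouched = list(result)

# Consuming the iterator executes the lambda for every element.
list(lazy)

print(untouched)  # []
print(result)     # [1, 2, 3]
```

Using map purely for its side effects tends to obscure the intent; a plain for loop is usually easier to read.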

cod3monk3y · Accepted Answer · 2015-01-28 17:08:03Z

11

How about dictionary comprehensions?

>>> mylist = [3, 2, 1, 3, 4, 4, 4, 5, 5, 3]

>>> {x:1 for x in mylist}.keys()
[1, 2, 3, 4, 5]

EDIT in response to @Danny's comment: my original suggestion does not keep the keys in their original order. If you need the order preserved, try:

>>> from collections import OrderedDict

>>> OrderedDict( (x,1) for x in mylist ).keys()
[3, 2, 1, 4, 5]

which keeps the elements in order of first occurrence (not extensively tested).
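The outputs shown in this answer are from Python 2. As a sketch of a Python 3 equivalent (an addition, not part of the original answer): keys() returns a view there, so wrap it in list() when a real list is needed. On Python 3.7+ a plain dict comprehension also keeps first-occurrence order.

```python
from collections import OrderedDict

mylist = [3, 2, 1, 3, 4, 4, 4, 5, 5, 3]

# keys() is a view in Python 3; list() turns it into an indexable list.
ordered = list(OrderedDict((x, 1) for x in mylist).keys())
print(ordered)  # [3, 2, 1, 4, 5]

# Python 3.7+ only: dicts preserve insertion order by language guarantee.
from_comprehension = list({x: 1 for x in mylist})
print(from_comprehension)  # [3, 2, 1, 4, 5]
```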

edited Jan 28, 2015 at 17:08

answered Jan 14, 2014 at 6:54

cod3monk3y


2 Comments

Danny Staple Over a year ago

This would not preserve order - dictionary order (and set order) is determined by the hashing algorithm and not insertion order. I am not sure of the effects of a dictionary comprehension with an OrderedDict type though.

cod3monk3y Over a year ago

@DannyStaple True. I added an example using OrderedDict and a generator, if ordered output is desired.

Tung Nguyen · Accepted Answer · 2016-05-31 15:14:13Z

5

The characteristics of sets in Python are that the data items in a set are unordered and duplicates are not allowed. If you try to add a data item to a set that already contains the data item, Python simply ignores it.

>>> l = ['a', 'a', 'bb', 'b', 'c', 'c', '10', '10', '8','8', 10, 10, 6, 10, 11.2, 11.2, 11, 11]
>>> distinct_l = set(l)
>>> print(distinct_l)
set(['a', '10', 'c', 'b', 6, 'bb', 10, 11, 11.2, '8'])

answered May 31, 2016 at 15:14

Tung Nguyen


Seitaridis · Accepted Answer · 2010-12-16 12:03:48Z

4

If all elements of the list may be used as dictionary keys (i.e. they are all hashable), this is often faster (from the Python Programming FAQ):

d = {}
for x in mylist:
    d[x] = 1
mylist = list(d.keys())

edited Dec 16, 2010 at 12:03

answered Dec 16, 2010 at 11:57

Seitaridis


sje397 · Accepted Answer · 2010-12-16 10:29:31Z

2

From http://www.peterbe.com/plog/uniqifiers-benchmark:

def f5(seq, idfun=None):  
    # order preserving
    if idfun is None:
        def idfun(x): return x
    seen = {}
    result = []
    for item in seq:
        marker = idfun(item)
        # in old Python versions:
        # if seen.has_key(marker)
        # but in new ones:
        if marker in seen: continue
        seen[marker] = 1
        result.append(item)
    return result

answered Dec 16, 2010 at 10:29

sje397


3 Comments

Thomas K Over a year ago

Wouldn't it make sense to use a set for seen, rather than a dict?

brildum Over a year ago

In Python, sets and dicts are built using hashtables so they are interchangeable in this scenario. They both provide the same operations (limiting duplicates) and both have the same running time.

Paweł Prażak Over a year ago

This one is slow, generator version is much faster

adam.lofts · Accepted Answer · 2013-12-16 10:09:58Z

2

The simplest way to remove duplicates whilst preserving order is to use collections.OrderedDict (Python 2.7+).

from collections import OrderedDict
d = OrderedDict()
for x in mylist:
    d[x] = True
print(list(d.keys()))  # d.iterkeys() is Python 2 only and prints an iterator object, not the keys

answered Dec 16, 2013 at 10:09

adam.lofts
