Performance issue with Python list operations

Question

I'm learning python and had an excercise where I had a list of integers and had to keep one on two numbers, then 2 on 3, then 3 on 4... And see at the end what was left. Ex : [1,2,3,4,5,6,7,8,9] --> [1,3,5,7,9] --> [1,3,7,9] --> [1,3,7]

My first try, just to see the output (beware, shockingly dirty code) :

n=input()
list2=list(range(1,int(n))) 
list1=[]
step=2
while list1 != list2 :
    countSteps=1
    position=0
    list1=list2
    list2=[]
    lenlist1=len(list1)
    while position < lenlist1 :
      while countSteps != step and position < lenlist1 :
        list2.append(list1[position])
        countSteps+=1
        position+=1
      position+=1
      countSteps = 1
    step+=1
print(list2)

The "idea" there was to use two lists and add 1/2/3... on 2/3/4... numbers from one to the other. It seems to me (maybe I'm wrong) that memory wise it was a bad choice, so I came up with the following :

n=input()
list1=list(range(1,int(n)))
lenCached=0
step=1
while lenCached!=len(list1) :
  lenCached = len(list1)
  position = step
  while position < len(list1):
    list1.pop(position)
    position+=step
  step+=1
print(list1)

I'm happy with how this last one looks but performance is horrible. When I run the first one with range(1,1000000), it takes 10s while the second one takes ages (several minutes). Why and is there something I can do about it ?

(if reading this you thought of a better way to achieve the initial goal I would of course be interested in hearing it!)

Thibaut · Accepted Answer · 2014-11-24 01:04:04Z

5

The reason why your second code is much slower is that the pop operation on lists is very slow (its running time is proportional to the length of the list instead of being constant) as explained here, unless you pop an item from the end of the list.

Here is a code which has similar performance to your first code (slightly better on my machine) but looks more idiomatic. In particular, I'm using a list comprehension instead of the series of append that you are using to construct l2 from l1:

n=input()
l = list(range(1, n))
step = 2
while step <= len(l):
    l = [num for ind, num in enumerate(l) if (ind + 1) % step != 0]
    step += 1
print(l)

A numpy implementation which is much faster:

import numpy as np

n = input()
a = np.arange(1, n)
step = 2
while step <= len(a):
    mask = np.ones(len(a), dtype=bool)
    ind = np.arange(step-1,len(a), step)
    mask[ind] = False
    a = a[mask]
    step += 1
print(a)

By the way, this process is known as the Flavius Josephus's sieve.

edited Nov 24, 2014 at 1:04

answered Nov 23, 2014 at 17:38

Thibaut

1,40811 silver badges16 bronze badges

Sign up to request clarification or add additional context in comments.

6 Comments

yuvi Over a year ago

one small note - if using python 2.x, replace input with raw_input

Thibaut Over a year ago

@yuvi thanks for the comment. I updated my code to stay consistent with the OP's question.

user2357112 Over a year ago

"Python basically copies the whole list each time you do a pop" - what? No it doesn't. Why would it do that? It just shifts elements to the left.

Thibaut Over a year ago

@user2357112 sorry, I wasn't thinking, I removed this sentence. In practice there could still be some copying going on when the dynamic array get resized but this doesn't happen frequently.

hggjvhkkfkhftrjghvgfgj Over a year ago

Definitly a helpful answer, thanks! I'm using the latest python though, no 2.x for me. You're using a lot of function I didn't know about, which is not a problem, however in your first code I don't understand the line l = [...], what is this strange python trick called, so I can read about it ?

|

Collectives™ on Stack Overflow

Performance issue with Python list operations

1 Answer 1

6 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

6 Comments

Your Answer

Sign up or log in

Post as a guest

Related