I have run into an oddity with creating nested lists in Python 2.6.6.
Consider the following two functions:
import time

def lists(n):
    start_time = time.time()
    lists = [None]*n
    for i in xrange(n):
        lists[i] = [None]*n
        for j in xrange(n):
            lists[i][j] = []
    print time.time() - start_time
def simple_lists(n):
    start_time = time.time()
    lists = [None]*n
    for i in xrange(n):
        lists[i] = [None]*n
        for j in xrange(n):
            lists[i][j] = False
    print time.time() - start_time
They both build an n*n structure. One fills every slot with False, and the other fills every slot with a fresh empty list. Both should run in O(n^2) as far as I can see. However, that does not seem to be the case... Observe the following test runs:
>>> for i in [4000, 8000, 16000]: simple_lists(i)
2.11170578003
8.67467808723
34.0958559513
>>> for i in [1000, 2000, 4000]: lists(i)
1.13742399216
7.39806008339
78.0808939934
As you can see, simple_lists() grows roughly as O(n^2) (doubling n roughly quadruples the time: 8.67 s to 34.1 s), while lists() grows by more than a factor of 10 per doubling, which is worse than O(n^3)!
What's going on here? This quirk is completely and utterly ruining the complexity of my code, and I can't explain why it's behaving like this. Does anyone have any ideas?
Edit: List comprehensions seem to cause the same complexity issues.
Redefining lists() like
def lists(n):
    start_time = time.time()
    lists = [[[] for y in xrange(n)] for x in xrange(n)]
    print time.time() - start_time
causes the following results
>>> for i in [1000, 2000, 4000]: lists(i)
0.388785839081
4.45830011368
65.6449248791
... which still clearly grows faster than O(n^2) (and even faster than O(n^3), sheesh).
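To put a number on it, here's a quick helper (my own sketch, assuming timings taken at sizes n and 2n on the same machine) that estimates the growth exponent from two consecutive runs:

import math

def growth_exponent(t1, t2):
    # Ratio of timings between sizes n and 2n: a factor of ~4 means
    # O(n^2), a factor of ~8 means O(n^3).
    return math.log(t2 / t1, 2)

print growth_exponent(8.67467808723, 34.0958559513)  # ~2.0 for simple_lists()
print growth_exponent(4.45830011368, 65.6449248791)  # ~3.9 for lists(): worse than cubic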
Edit 2: After looking into the problem further, it seems to be caused by the garbage collector. My understanding is that CPython's cyclic collector runs whenever the number of container allocations since the last collection crosses a threshold (700 by default), and its periodic full sweeps have to traverse every live container; with n^2 lists alive, each sweep costs O(n^2), so the total work grows superlinearly. After running gc.disable(), this is the result with the original lists() definition:
>>> for i in [1000, 2000, 4000]: lists(i)
...
0.155457019806
0.616811990738
2.38965821266
which is pretty damn neatly O(n^2).
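If you'd rather not leave collection off for the rest of the program, a pattern that should work (my own wrapper, not part of the code above) is to pause the collector only around the bulk allocation:

import gc
import time

def lists_nogc(n):
    # Same allocation as lists(), but with the cyclic collector paused
    # only for the duration of the timed section.
    gc.disable()
    try:
        start_time = time.time()
        lists = [[[] for y in xrange(n)] for x in xrange(n)]
        print time.time() - start_time
    finally:
        gc.enable()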
Moral of the story: don't trust the garbage collector!
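(If disabling collection outright feels too drastic, I believe raising the collector's thresholds works too; the defaults are (700, 10, 10), so a collection fires roughly every 700 net container allocations:)

import gc

# Raise the generation-0 trigger so sweeps rarely fire while the
# n*n lists are being built; cyclic garbage is still collected,
# just far less often.
gc.set_threshold(10000000)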