4

I am not new to it, but I do not use Python much and my knowledge is rather broad and not very deep in the language, perhaps someone here more knowledgeable can answer my question. I find myself in the situation when I need to add items to a list and keep it sorted as items as added. A quick way of doing this would be.

list.append(item)                  // O(1)
list.sort()                        // ??

I would imagine if this is the only way items are added to the list, I would hope the sort would be rather efficient, because the list is sorted with each addition. However there is also this that works:

inserted = False
for i in range(len(list)):         // O(N)
    if (item < list[i]): 
        list.insert(i, item)       // ??
        inserted = True
        break
if not inserted: list.append(item)

Can anyone tell me if one of these is obviously more efficient? I am leaning toward the second set of statements, however I really have no idea.

5
  • wiki.python.org/moin/TimeComplexity shows that .insert() is O(N) and .sort() is O(N log N). Commented Feb 13, 2013 at 19:20
  • 1
    (Also note that your proposed second case doesn't work at all for adding an item that's less than everything already in the list.) Commented Feb 13, 2013 at 19:28
  • @Wooble, .sort() is O(N) in this case because the list is already sorted apart from the new item. Commented Feb 13, 2013 at 19:33
  • 1
    A heapq may be a better choice for this depending on your requirements Commented Feb 13, 2013 at 19:53
  • If you need to keep the data sorted while allowing for frequent inserts, a tree can be a better option than a list. Unfortunately Python doesn't come with a sorted sorted tree type. Fortunately there are several libraries available. See the links in @Princess Of the Universe's answer for more information. Commented Feb 13, 2013 at 20:40

2 Answers 2

7

What you are looking for is the bisect module and most possible the insort_left

So your expression could be equivalently written as

from

some_list.append(item)                  // O(1)
some_list.sort()                        // ??

to

bisect.insort_left(some_list, item)
Sign up to request clarification or add additional context in comments.

Comments

2

Insertion anywhere except near the end takes O(n) time, as it has to move (copy) all elements which come after the insertion point. But on the other hand, all comparison-based sorting algorithm must, on average, make Omega(n log n) comparisons. Many sorts (including timsort, which Python uses) will do significantly better on many inputs, likely including yours (the "almost sorted" case). They still have to move at least that as many elements as inserting in the right position right away though. They also have to do quite a bit of additional work (inspecting all element to ensure they're in the right order, plus more complicated logic that often improves performance, but not in your case). For these reasons, it's probably slower, at least for large lists.

Due to being written in C (in CPython; but similar reasoning applies for other Pythons), it may still be faster than your Python-written linear scan. That leaves the question of how to find the insertion point. Binary search can do this part in O(log n) time, so it's quite useful here (of course, insertion is still O(n), but there's no way around this if you want a sorted list). Unfortunately, binary search is rather tricky to implement. Fortunately, it's already implemented in the standard library: bisect.

3 Comments

` sorting takes up O(n log n) time.`, you mean best case complexity
@Abhijit I don't think that's the right term (for many sorting algorithms, the best case is an already-sorted input and they handle that in O(n) time). But you're right, my wording is not optimal. I'll see how I can improve that part.
@Abhijit: timsort is O(N log N) in the average and worst case, O(N) in the best case.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.