
I have a function that takes a dictionary and a value n as input. Each value in the dictionary is a set with one or more items. The function should sort the dictionary keys and then extract and return n values. This function will be executed very often, so I am trying to optimize it. Any suggestions?

def select_items(temp_dict, n):
  """Select n items from the dictionary"""
  res = []
  sort_keys = sorted(temp_dict.keys())
  count = 0

  for key in sort_keys:
    for pair in temp_dict[key]:
      if count < n:
        res.append(pair)
        count += 1
      else:
        return res

  return res

In this code I use a count and an "if" statement to control the number of selected values. Is there a way to optimize this code using some function from itertools, or something else?

  • This smells like premature optimization. Did you profile your code to verify that this piece of code is actually a bottleneck? If not, you probably guessed wrong. Commented Jan 2, 2012 at 19:39

4 Answers


Here's my first attempt (see select_items_faster), which almost doubles the speed:

In [12]: print _11
import itertools

def select_items_original(temp_dict, n):
  """Select n items from the dictionary"""
  res = []
  sort_keys = sorted(temp_dict.keys())
  count = 0

  for key in sort_keys:
    for pair in temp_dict[key]:
      if count < n:
        res.append(pair)
        count += 1
      else:
        return res

  return res

def select_items_faster(temp_dict, n):
    """Select n items from the dictionary"""
    items = temp_dict.items()
    items.sort()

    return list(itertools.chain.from_iterable(val for (_, val) in itertools.islice(items, n)))

test_dict = dict((x, ["a"] * int(x / 500)) for x in range(1000))
test_n = 300


In [13]: %timeit select_items_original(test_dict, test_n)
1000 loops, best of 3: 293 us per loop


In [14]: %timeit select_items_faster(test_dict, test_n)
1000 loops, best of 3: 203 us per loop

Replacing the itertools.islice with a [:n] doesn't really help things:

def select_items_faster_slice(temp_dict, n):
    """Select n items from the dictionary"""
    items = temp_dict.items()
    items.sort()

    return list(itertools.chain.from_iterable(val for (_, val) in items[:n]))

In [16]: %timeit select_items_faster_slice(test_dict, test_n)
1000 loops, best of 3: 210 us per loop

And neither does sorted:

In [18]: %timeit select_items_faster_sorted(test_dict, test_n)
1000 loops, best of 3: 213 us per loop


In [19]: print _17
def select_items_faster_sorted(temp_dict, n):
    """Select n items from the dictionary"""
    return list(itertools.chain.from_iterable(val for (_, val) in itertools.islice(sorted(temp_dict.items()), n)))

But a combination of map and __getitem__ is much faster:

In [22]: %timeit select_items_faster_map_getitem(test_dict, test_n)
10000 loops, best of 3: 90.7 us per loop

In [23]: print _20
def select_items_faster_map_getitem(temp_dict, n):
    """Select n items from the dictionary"""
    keys = temp_dict.keys()
    keys.sort()
    return list(itertools.chain.from_iterable(map(temp_dict.__getitem__, keys[:n])))

Replacing the list(itertools.chain.from_iterable) with some magic speeds things up quite a bit:

In [28]: %timeit select_items_faster_map_getitem_list_extend(test_dict, test_n)
10000 loops, best of 3: 74.9 us per loop

In [29]: print _27
def select_items_faster_map_getitem_list_extend(temp_dict, n):
    """Select n items from the dictionary"""
    keys = temp_dict.keys()
    keys.sort()
    result = []
    filter(result.extend, map(temp_dict.__getitem__, keys[:n]))
    return result

And replacing the map and slice with itertools functions squeezes out a tiny bit more speed:

In [31]: %timeit select_items_faster_map_getitem_list_extend_iterables(test_dict, test_n)
10000 loops, best of 3: 72.8 us per loop

In [32]: print _30
def select_items_faster_map_getitem_list_extend_iterables(temp_dict, n):
    """Select n items from the dictionary"""
    keys = temp_dict.keys()
    keys.sort()
    result = []
    filter(result.extend, itertools.imap(temp_dict.__getitem__, itertools.islice(keys, n)))
    return result

And that is about as fast as I think it can get, because in CPython Python function calls are rather slow, and this minimizes the number of Python function calls that are made in the inner loop.

Note:

  • Since the OP didn't provide any hint at what the input data looks like, I had to guess. I could be way off, and this could drastically change the meaning of "fast".
  • Every one of my implementations returns n - 1 items, not n.
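For anyone on Python 3, the tricks above stop working as written: dict.keys() returns a view with no .sort(), and filter()/map() are lazy, so filter(result.extend, ...) never runs. A rough Python 3 port of the fastest variant (a sketch, not from the original answer) might look like this; note it keeps the same caveat of flattening the values of the first n keys rather than returning exactly n items:

```python
def select_items_py3(temp_dict, n):
    """Flatten the values of the n smallest keys (Python 3 sketch)."""
    keys = sorted(temp_dict)                  # sorted() replaces keys.sort()
    result = []
    for value in map(temp_dict.__getitem__, keys[:n]):
        result.extend(value)                  # explicit loop replaces the filter() trick
    return result
```

For example, `select_items_py3({1: [1, 2, 3], 2: [4, 5, 6]}, 1)` returns `[1, 2, 3]`.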

Edit: Using the same method to profile J.F. Sebastian's code:

In [2]: %timeit select_items_heapq(test_dict, test_n)
1000 loops, best of 3: 572 us per loop

In [3]: print _1
from itertools import *
import heapq

def select_items_heapq(temp_dict, n):
    return list(islice(chain.from_iterable(imap(temp_dict.get, heapq.nsmallest(n, temp_dict))),n))

And TokenMacGuy's code:

In [5]: %timeit select_items_tokenmacguy_first(test_dict, test_n)
1000 loops, best of 3: 201 us per loop

In [6]: %timeit select_items_tokenmacguy_second(test_dict, test_n)
1000 loops, best of 3: 730 us per loop

In [7]: print _4
def select_items_tokenmacguy_first(m, n):
    k, v, r = m.keys(), m.values(), range(len(m))
    r.sort(key=k.__getitem__)
    return [v[i] for i in r[:n]]

import heapq
def select_items_tokenmacguy_second(m, n):
    k, v, r = m.keys(), m.values(), range(len(m))
    smallest = heapq.nsmallest(n, r, k.__getitem__)
    for i, ind in enumerate(smallest):
        smallest[i] = v[ind]
    return smallest

10 Comments

You have sorted the items, but I would like to have the keys sorted as the main criterion!
@user963386: He sorts by key.
Also, if you just want to iterate over the result and don't need it as a list, you can return a generator instead of a temporary list.
Because dict.items() returns (key, value) tuples, sorting the .items() list is the same as sorting by key (except when keys are identical...).
The OP's code is for pair in temp_dict[key]: res.append(pair), so each value must be an iterable (it is equivalent to res.extend(temp_dict[key])). Therefore most of the above examples are wrong.
from itertools import *
import heapq
islice(chain.from_iterable(imap(temp_dict.get, heapq.nsmallest(n, temp_dict))),n)
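As the commenters ask for an explanation, here is the same pipeline unrolled step by step on a toy dictionary (with the Python-2-only imap swapped for map so it also runs on Python 3):

```python
from itertools import chain, islice
import heapq

temp_dict = {2: [4, 5, 6], 1: [1, 2, 3]}
n = 5

keys = heapq.nsmallest(n, temp_dict)    # [1, 2] -- the n smallest keys, no full sort
values = map(temp_dict.get, keys)       # the matching value sequences
flat = chain.from_iterable(values)      # yields 1, 2, 3, 4, 5, 6 lazily
result = list(islice(flat, n))          # cap the flattened stream at n items
print(result)                           # [1, 2, 3, 4, 5]
```

Note that, unlike most of the variants above, this one caps the *flattened* output at n items, which matches the OP's original loop.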

7 Comments

Clever use of the standard library, but would be even nicer with some explanation.
@J.F.Sebastian Hmmm :) But others, who are not familiar with itertools and heapq, may be utterly frustrated. I think these modules deserve to be mentioned in the answer.
Fancy, but my profiling shows it to be three times slower than the original (see my edited post).
@NiklasBaumstark Just a question that may come in handy in the future. Is it OK to add (for me) some remarks to the answers of the guys like Sebastian? He has 35k rep! I think he must understand what he is doing. But on the other hand to my mind this answer may be so confusing for some readers.
A note regarding the performance of this snippet: nsmallest becomes slower as n gets bigger because of an inherent quality of heaps. The docs say: (...) (they) perform best for smaller values of n. For larger values, it is more efficient to use the sorted() function. Also, when n==1, it is more efficient to use the built-in min() and max() functions..

Using a generator expression is a cleaner/more readable alternative in my opinion. Using a slice avoids the if clause.

def select_items(dic, n):
  return (dic[key] for key in sorted(dic.keys())[:n])

On speed: I think the actual sort call is probably the biggest bottleneck here, although you probably shouldn't worry about it until you hit a big size for the dictionary. In that case you should probably look into keeping the dictionary ordered in the first place - you pay a complexity price on insertion, but lookups/selections are fast. An example is sorteddict. A tree-based data structure may be another alternative.
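sorteddict is a third-party package; as a rough stdlib-only sketch of the same "keep it ordered on insertion" idea (the class and method names here are illustrative, not from any library), the keys can be kept sorted with bisect so selection needs no sort at query time:

```python
import bisect

class SortedKeyDict:
    """Toy mapping that keeps its keys sorted on insert (illustrative sketch)."""
    def __init__(self):
        self._data = {}
        self._keys = []                      # always kept in sorted order

    def insert(self, key, value):
        if key not in self._data:
            bisect.insort(self._keys, key)   # O(len) list insert, preserves order
        self._data[key] = value

    def select(self, n):
        # values of the n smallest keys; no sorting needed here
        return [self._data[k] for k in self._keys[:n]]
```

You pay the list-insert cost on every new key, but select() becomes a plain slice plus lookups.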

On to the benchmarks. Initial setup, lifted from David Wolever's nice answer:

test_dict = dict((x, "a") for x in range(1000))
test_n = 300

Your version:

%timeit select_items(test_dict, test_n)
1000 loops, best of 3: 334 us per loop

This version:

%timeit select_items(test_dict, test_n)
10000 loops, best of 3: 49.1 us per loop

5 Comments

Might be cleaner, but OP didn't ask about clean — OP asked about faster. Can you show that this is faster?
@David: thanks for your comment. I can't run a benchmark now but I added a recommendation that's strictly speed related.
Why not? Check out ipython's awesome %timeit magic function (I use it in my answer)
@DavidWolever: it is great! I was just leaving an hour ago, but I've just updated the answer with my benchmark.
In your benchmark, please also include the input data, and whatever else I'll need to recreate it (e.g., that's why I print the function after each test, as well as the input data, so if anyone has beef with my results they can run the tests themselves).

The answers given so far don't follow the user's spec.

The data is a dictionary of sequences, and the desired result is a list of the first n elements of the dictionary values taken in sorted order by key.

So if the data is:

{1: [1, 2, 3], 2: [4, 5, 6]}

then, if n = 5, the result should be:

[1, 2, 3, 4, 5]

Given that, here's a script which compares the original function with a (slightly) optimised new version:

from timeit import timeit

def select_items_old(temp_dict, n):
  res = []
  sort_keys = sorted(temp_dict.keys())
  count = 0
  for key in sort_keys:
    for pair in temp_dict[key]:
      if count < n:
        res.append(pair)
        count += 1
      else:
        return res
  return res

def select_items_new(data, limit):
    count = 0
    result = []
    extend = result.extend
    for key in sorted(data.keys()):
        value = data[key]
        extend(value)
        count += len(value)
        if count >= limit:
            break
    return result[:limit]

data = {x:range(10) for x in range(1000)}

def compare(*args):
    number = 1000
    for func in args:
        name = func.__name__
        print ('test: %s(data, 12): %r' % (name, func(data, 12)))
        code = '%s(data, %d)' % (name, 300)
        duration = timeit(
            code, 'from __main__ import %s, data' % name, number=number)
        print ('time: %s: %.2f usec/pass\n' % (code, 1000000 * duration/number))

compare(select_items_old, select_items_new)

Output:

test: select_items_old(data, 12): [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1]
time: select_items_old(data, 300): 163.81 usec/pass

test: select_items_new(data, 12): [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1]
time: select_items_new(data, 300): 67.74 usec/pass

