
I have a function that takes a dictionary and a value n as input. Each value in the dictionary is a set with one or more items. The function should sort the dictionary keys and then extract and return n values. This function will be executed very often, so I am trying to optimize it. Any suggestions?

def select_items(temp_dict, n):
  """Select n items from the dictionary"""
  res = []
  sort_keys = sorted(temp_dict.keys())
  count = 0

  for key in sort_keys:
    for pair in temp_dict[key]:
      if count < n:
        res.append(pair)
        count += 1
      else:
        return res

  return res

In this code I use a count and an "if" statement to control the number of selected values. Is there a way to optimize this code using some function from itertools, or something else?

  • This smells like premature optimization. Did you profile your code to verify that this piece of code is actually a bottleneck? If not, you probably guessed wrong. Commented Jan 2, 2012 at 19:39

4 Answers


Here's my first attempt (see select_items_faster), which almost doubles the speed:

In [12]: print _11
import itertools

def select_items_original(temp_dict, n):
  """Select n items from the dictionary"""
  res = []
  sort_keys = sorted(temp_dict.keys())
  count = 0

  for key in sort_keys:
    for pair in temp_dict[key]:
      if count < n:
        res.append(pair)
        count += 1
      else:
        return res

  return res

def select_items_faster(temp_dict, n):
    """Select n items from the dictionary"""
    items = temp_dict.items()
    items.sort()

    return list(itertools.chain.from_iterable(val for (_, val) in itertools.islice(items, n)))

test_dict = dict((x, ["a"] * int(x / 500)) for x in range(1000))
test_n = 300


In [13]: %timeit select_items_original(test_dict, test_n)
1000 loops, best of 3: 293 us per loop


In [14]: %timeit select_items_faster(test_dict, test_n)
1000 loops, best of 3: 203 us per loop

Replacing the itertools.islice with a [:n] doesn't really help things:

def select_items_faster_slice(temp_dict, n):
    """Select n items from the dictionary"""
    items = temp_dict.items()
    items.sort()

    return list(itertools.chain.from_iterable(val for (_, val) in items[:n]))

In [16]: %timeit select_items_faster_slice(test_dict, test_n)
1000 loops, best of 3: 210 us per loop

And neither does sorted:

In [18]: %timeit select_items_faster_sorted(test_dict, test_n)
1000 loops, best of 3: 213 us per loop


In [19]: print _17
def select_items_faster_sorted(temp_dict, n):
    """Select n items from the dictionary"""
    return list(itertools.chain.from_iterable(val for (_, val) in itertools.islice(sorted(temp_dict.items()), n)))

But a combination of map and __getitem__ is much faster:

In [22]: %timeit select_items_faster_map_getitem(test_dict, test_n)
10000 loops, best of 3: 90.7 us per loop

In [23]: print _20
def select_items_faster_map_getitem(temp_dict, n):
    """Select n items from the dictionary"""
    keys = temp_dict.keys()
    keys.sort()
    return list(itertools.chain.from_iterable(map(temp_dict.__getitem__, keys[:n])))

Replacing the list(itertools.chain.from_iterable) with some magic speeds things up quite a bit:

In [28]: %timeit select_items_faster_map_getitem_list_extend(test_dict, test_n)
10000 loops, best of 3: 74.9 us per loop

In [29]: print _27
def select_items_faster_map_getitem_list_extend(temp_dict, n):
    """Select n items from the dictionary"""
    keys = temp_dict.keys()
    keys.sort()
    result = []
    filter(result.extend, map(temp_dict.__getitem__, keys[:n]))
    return result

And replacing the map and slice with itertools functions squeezes out a tiny bit more speed:

In [31]: %timeit select_items_faster_map_getitem_list_extend_iterables(test_dict, test_n)
10000 loops, best of 3: 72.8 us per loop

In [32]: print _30
def select_items_faster_map_getitem_list_extend_iterables(temp_dict, n):
    """Select n items from the dictionary"""
    keys = temp_dict.keys()
    keys.sort()
    result = []
    filter(result.extend, itertools.imap(temp_dict.__getitem__, itertools.islice(keys, n)))
    return result

And that is about as fast as I think it can get, because in CPython Python function calls are rather slow, and this minimizes the number of Python function calls that are made in the inner loop.

Note:

  • Since the OP didn't provide any hint at what the input data looks like, I had to guess. I could be way off, and this could drastically change the meaning of "fast".
  • Every one of my implementations returns n - 1 items, not n.
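For anyone on Python 3, the tricks above stop working as written: dict.keys() returns a view with no .sort(), and filter()/map() are lazy, so filter(result.extend, ...) never runs. A rough Python 3 port of the fastest variant (a sketch, not from the original answer) might look like this; note it keeps the same caveat of flattening the values of the first n keys rather than returning exactly n items:

```python
def select_items_py3(temp_dict, n):
    """Flatten the values of the n smallest keys (Python 3 sketch)."""
    keys = sorted(temp_dict)                  # sorted() replaces keys.sort()
    result = []
    for value in map(temp_dict.__getitem__, keys[:n]):
        result.extend(value)                  # explicit loop replaces the filter() trick
    return result
```

For example, `select_items_py3({1: [1, 2, 3], 2: [4, 5, 6]}, 1)` returns `[1, 2, 3]`.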

Edit: Using the same method to profile J.F. Sebastian's code:

In [2]: %timeit select_items_heapq(test_dict, test_n)
1000 loops, best of 3: 572 us per loop

In [3]: print _1
from itertools import *
import heapq

def select_items_heapq(temp_dict, n):
    return list(islice(chain.from_iterable(imap(temp_dict.get, heapq.nsmallest(n, temp_dict))),n))

And TokenMacGuy's code:

In [5]: %timeit select_items_tokenmacguy_first(test_dict, test_n)
1000 loops, best of 3: 201 us per loop

In [6]: %timeit select_items_tokenmacguy_second(test_dict, test_n)
1000 loops, best of 3: 730 us per loop

In [7]: print _4
def select_items_tokenmacguy_first(m, n):
    k, v, r = m.keys(), m.values(), range(len(m))
    r.sort(key=k.__getitem__)
    return [v[i] for i in r[:n]]

import heapq
def select_items_tokenmacguy_second(m, n):
    k, v, r = m.keys(), m.values(), range(len(m))
    smallest = heapq.nsmallest(n, r, k.__getitem__)
    for i, ind in enumerate(smallest):
        smallest[i] = v[ind]
    return smallest

10 Comments

You have sorted the items, but I would like to have the keys sorted as the main criterion!
@user963386: He sorts by key.
Also, if you just want to iterate over the result and don't need it as a list, you can return a generator instead of a temporary list.
Because dict.items() returns (key, value) tuples, sorting the .items() list is the same as sorting by key (except when keys are identical...).
The OP's code is for pair in temp_dict[key]: res.append(pair), so each value must be an iterable (it is equivalent to res.extend(temp_dict[key])). Therefore most of the above examples are wrong.
from itertools import *
import heapq
islice(chain.from_iterable(imap(temp_dict.get, heapq.nsmallest(n, temp_dict))),n)
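As the commenters ask for an explanation, here is the same pipeline unrolled step by step on a toy dictionary (with the Python-2-only imap swapped for map so it also runs on Python 3):

```python
from itertools import chain, islice
import heapq

temp_dict = {2: [4, 5, 6], 1: [1, 2, 3]}
n = 5

keys = heapq.nsmallest(n, temp_dict)    # [1, 2] -- the n smallest keys, no full sort
values = map(temp_dict.get, keys)       # the matching value sequences
flat = chain.from_iterable(values)      # yields 1, 2, 3, 4, 5, 6 lazily
result = list(islice(flat, n))          # cap the flattened stream at n items
print(result)                           # [1, 2, 3, 4, 5]
```

Note that, unlike most of the variants above, this one caps the *flattened* output at n items, which matches the OP's original loop.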

7 Comments

Clever use of the standard library, but would be even nicer with some explanation.
@J.F.Sebastian Hmmm :) But others, who are not familiar with itertools and heapq, may be utterly frustrated. I think these modules deserve to be mentioned in the answer.
Fancy, but my profiling shows it to be three times slower than the original (see my edited post).
@NiklasBaumstark Just a question that may come in handy in the future. Is it OK to add (for me) some remarks to the answers of the guys like Sebastian? He has 35k rep! I think he must understand what he is doing. But on the other hand to my mind this answer may be so confusing for some readers.
A note regarding the performance of this snippet: nsmallest becomes slower as n gets bigger because of an inherent quality of heaps. The docs say: (...) (they) perform best for smaller values of n. For larger values, it is more efficient to use the sorted() function. Also, when n==1, it is more efficient to use the built-in min() and max() functions..

Using a generator expression is a cleaner/more readable alternative in my opinion. Using a slice avoids the if clause.

def select_items(dic, n):
  return (dic[key] for key in sorted(dic.keys())[:n])

On speed: I think the actual sort call is probably the biggest bottleneck here, although you probably shouldn't worry about it until you hit a big size for the dictionary. In that case you should probably look into keeping the dictionary ordered in the first place - you pay a complexity price on insertion, but lookups/selections are fast. An example is sorteddict. A tree-based data structure may be another alternative.
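sorteddict is a third-party package; as a rough stdlib-only sketch of the same "keep it ordered on insertion" idea (the class and method names here are illustrative, not from any library), the keys can be kept sorted with bisect so selection needs no sort at query time:

```python
import bisect

class SortedKeyDict:
    """Toy mapping that keeps its keys sorted on insert (illustrative sketch)."""
    def __init__(self):
        self._data = {}
        self._keys = []                      # always kept in sorted order

    def insert(self, key, value):
        if key not in self._data:
            bisect.insort(self._keys, key)   # O(len) list insert, preserves order
        self._data[key] = value

    def select(self, n):
        # values of the n smallest keys; no sorting needed here
        return [self._data[k] for k in self._keys[:n]]
```

You pay the list-insert cost on every new key, but select() becomes a plain slice plus lookups.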

On to the benchmarks. Initial setup, lifted from David Wolever's nice answer:

test_dict = dict((x, "a") for x in range(1000))
test_n = 300

Your version:

%timeit select_items(test_dict, test_n)
1000 loops, best of 3: 334 us per loop

This version:

%timeit select_items(test_dict, test_n)
10000 loops, best of 3: 49.1 us per loop

5 Comments

Might be cleaner, but OP didn't ask about clean — OP asked about faster. Can you show that this is faster?
@David: thanks for your comment. I can't run a benchmark now but I added a recommendation that's strictly speed related.
Why not? Check out ipython's awesome %timeit magic function (I use it in my answer)
@DavidWolever: it is great! I was just leaving an hour ago, but I've just updated the answer with my benchmark.
In your benchmark, please also include the input data, and whatever else I'll need to recreate it (e.g., that's why I print the function after each test, as well as the input data, so if anyone has beef with my results they can run the tests themselves).

The answers given so far don't follow the user's spec.

The data is a dictionary of sequences, and the desired result is a list of the first n elements of the dictionary values taken in sorted order by key.

So if the data is:

{1: [1, 2, 3], 2: [4, 5, 6]}

then, if n = 5, the result should be:

[1, 2, 3, 4, 5]

Given that, here's a script which compares the original function with a (slightly) optimised new version:

from timeit import timeit

def select_items_old(temp_dict, n):
  res = []
  sort_keys = sorted(temp_dict.keys())
  count = 0
  for key in sort_keys:
    for pair in temp_dict[key]:
      if count < n:
        res.append(pair)
        count += 1
      else:
        return res
  return res

def select_items_new(data, limit):
    count = 0
    result = []
    extend = result.extend
    for key in sorted(data.keys()):
        value = data[key]
        extend(value)
        count += len(value)
        if count >= limit:
            break
    return result[:limit]

data = {x:range(10) for x in range(1000)}

def compare(*args):
    number = 1000
    for func in args:
        name = func.__name__
        print ('test: %s(data, 12): %r' % (name, func(data, 12)))
        code = '%s(data, %d)' % (name, 300)
        duration = timeit(
            code, 'from __main__ import %s, data' % name, number=number)
        print ('time: %s: %.2f usec/pass\n' % (code, 1000000 * duration/number))

compare(select_items_old, select_items_new)

Output:

test: select_items_old(data, 12): [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1]
time: select_items_old(data, 300): 163.81 usec/pass

test: select_items_new(data, 12): [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1]
time: select_items_new(data, 300): 67.74 usec/pass

