2

We are given a numpy array (ndarray) of lists (dtype=object), and would like to return a similar array of lists where each list is sorted. Is there an efficient way of doing this (i.e. without for loop etc)?

Please don't offer np.vectorize() as a solution as this is implemented as a for loop and is thus inefficient.

For example:

a=np.array([[5,4],[6,7,2],[8,1,9]],dtype=object)

so a is:

array([list([5, 4]), list([6, 7, 2]), list([8, 1, 9])], dtype=object)

and we would like the function to sort it so we would get:

array([list([4, 5]), list([2, 6, 7]), list([1, 8, 9])], dtype=object)
10
  • So using a list comprehension for example is too slow? Commented Dec 8, 2019 at 12:18
  • 2
    Numpy's speed comes from homogeneous rectangular arrays, not magic. Numpy won't be faster for your problem, unless np.sort is faster than timsort for your individual lists. Commented Dec 8, 2019 at 12:21
  • @AndrasDeak I disagree, the speed comes also comes from vectorizing by pushing loops into the c, and that does not necessarily require rectangular arrays. Commented Dec 8, 2019 at 12:23
  • 1
    Python's sort is also written in C. A lot of numpy speedup comes from caching in the CPU which is gone if your array only contains pointers to data far apart. Commented Dec 8, 2019 at 12:25
  • 1
    frompyfunc often is twice as fast as iteration when working with the elements of a object array. Commented Dec 8, 2019 at 16:16

2 Answers 2

4

Your example, and an expanded version for time tests:

In [202]: a=np.array([[5,4],[6,7,2],[8,1,9]],dtype=object)                      
In [203]: A = a.repeat(100)                                                     

Applying Python list sort to each element:

In [204]: np.array([sorted(i) for i in a])                                      
Out[204]: array([list([4, 5]), list([2, 6, 7]), list([1, 8, 9])], dtype=object)

Using frompyfunc to do the same:

In [205]: np.frompyfunc(sorted,1,1)(a)                                          
Out[205]: array([list([4, 5]), list([2, 6, 7]), list([1, 8, 9])], dtype=object)

Some timings:

In [206]: timeit np.array(list(map(sorted, A)))                                 
168 µs ± 221 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [207]: timeit np.array([sorted(i) for i in A])                               
181 µs ± 249 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

map is a little faster than a list comprehension. I prefer the readability of the comprehension.

A pure list version is quite a bit faster:

In [208]: %%timeit temp=A.tolist() 
     ...: list(map(sorted, temp)) 
     ...:  
     ...:                                                                       
88.3 µs ± 70.8 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

frompyfunc is faster than the array map, and almost as good as the pure list version:

In [209]: timeit np.frompyfunc(sorted,1,1)(A)                                   
97.3 µs ± 1.93 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

That's the pattern I've seen before. frompyfunc is the fastest way to apply functions to elements of an object dtype array, but it seldom is better than a list based iteration.

Sign up to request clarification or add additional context in comments.

6 Comments

Thanks, very interesting. So how much of the difference between np.array(list(map(sorted, A))) and the pure python version comes from the need to convert the list back to a numpy ndarray at the end?
Most of the difference lies in that conversion back to array. Starting with a list helps some, but not nearly as much.
Great, thanks. Unforunately I need this to end up as a numpy array (for other fast operations), so I guess there is no getting around this.
@Bitwise heads-up: those "other fast operations" might also be faster in native python.
@AndrasDeak thanks, in a general case that might be true, but unfortunately that is not the case for me.
|
1
np.array(list(map(sorted, a)))

gives:

array([list([4, 5]), list([2, 6, 7]), list([1, 8, 9])], dtype=object)

1 Comment

Good solution, but not numpy.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.