Iterative subtraction of elements in array in Python

Question

I have a large numpy array. Is there a way to subtract each element with the elements below it, and store the result in a new list/array, without using a loop.

A simple example of what I mean:

a = numpy.array([4,3,2,1])

result = [4-3, 4-2, 4-1, 3-2, 3-1, 2-1] = [1, 2, 3, 1, 2 ,1]

Note that the 'real' array I am working with doesn't contain numbers in sequence. This is just to make the example simple.

I know the result should have (n-1)! elements, where n is the size of the array.

Is there a way to do this without using a loop, but by repeating the array in a 'smart' way?

Thanks!

Why (n-1)! elements? It looks to me like it should be n choose 2. — user2357112
– user2357112, Commented Mar 24, 2017 at 20:29

user2357112 · Accepted Answer · 2017-03-24 22:23:26Z

3

temp = a[:, None] - a
result = temp[np.triu_indices(len(a), k=1)]

Perform all pairwise subtractions to produce temp, including subtracting elements from themselves and subtracting earlier elements from later elements, then use triu_indices to select the results we want. (a[:, None] adds an extra length-1 axis to a.)

Note that almost all of the runtime is spent constructing result from temp (because triu_indices is slow and using indices to select the upper triangle of an array is slow). If you can use temp directly, you can save a lot of time:

In [13]: a = numpy.arange(2000)

In [14]: %%timeit
   ....: temp = a[:, None] - a
   ....: 
100 loops, best of 3: 6.99 ms per loop

In [15]: %%timeit
   ....: temp = a[:, None] - a
   ....: result = temp[numpy.triu_indices(len(a), k=1)]
   ....: 
10 loops, best of 3: 51.7 ms per loop

edited Mar 24, 2017 at 22:23

answered Mar 24, 2017 at 20:39

user2357112

286k32 gold badges490 silver badges570 bronze badges

Sign up to request clarification or add additional context in comments.

2 Comments

Some Guy Over a year ago

Elegant ! Can you give a reference for this a[:,None] indexing you are using.

user2357112 Over a year ago

@SomeGuy: Added.

Divakar · Accepted Answer · 2017-03-24 22:16:07Z

2

Here's a masking based approach for the extraction after broadcasted subtractions and for the mask creation we are again making use of broadcasting (double broadcasting powered so to speak) -

r = np.arange(a.size)
out = (a[:, None] - a)[r[:,None] < r]

Runtime test

Vectorized approaches -

# @user2357112's solution
def pairwise_diff_triu_indices_based(a):
    return (a[:, None] - a)[np.triu_indices(len(a), k=1)]

# Proposed in this post
def pairwise_diff_masking_based(a):
    r = np.arange(a.size)
    return (a[:, None] - a)[r[:,None] < r]

Timings -

In [109]: a = np.arange(2000)

In [110]: %timeit pairwise_diff_triu_indices_based(a)
10 loops, best of 3: 36.1 ms per loop

In [111]: %timeit pairwise_diff_masking_based(a)
100 loops, best of 3: 11.8 ms per loop

Closer look at involved performance parameters

Let's dig deep a bit through the timings on this setup to study how much mask based approach helps. Now, for comparison there are two parts - Mask creation vs. indices creation and Mask based boolean indexing vs. integer based indexing.

How much mask creation helps?

In [37]: r = np.arange(a.size)

In [38]: %timeit np.arange(a.size)
1000000 loops, best of 3: 1.88 µs per loop

In [39]: %timeit r[:,None] < r
100 loops, best of 3: 3 ms per loop

In [40]: %timeit np.triu_indices(len(a), k=1)
100 loops, best of 3: 14.7 ms per loop

About 5x improvement on mask creation over index setup.

How much boolean indexing helps against integer based indexing?

In [41]: mask = r[:,None] < r

In [42]: idx = np.triu_indices(len(a), k=1)

In [43]: subs = a[:, None] - a

In [44]: %timeit subs[mask]
100 loops, best of 3: 4.15 ms per loop

In [45]: %timeit subs[idx]
100 loops, best of 3: 10.9 ms per loop

About 2.5x improvement here.

edited Mar 24, 2017 at 22:16

answered Mar 24, 2017 at 21:41

Divakar

222k19 gold badges273 silver badges374 bronze badges

2 Comments

user2357112 Over a year ago

That's a much faster way to extract the upper triangle.

Divakar Over a year ago

@user2357112 Yup! Added a bit of closer look on it.

Kenan Banks · Accepted Answer · 2017-03-24 20:45:21Z

1

a = [4, 3, 2, 1]
differences = ((x - y) for i, x in enumerate(a) for y in a[i+1:])
for diff in differences:
  # do something with difference.
  pass

edited Mar 24, 2017 at 20:45

answered Mar 24, 2017 at 20:32

Kenan Banks

213k36 gold badges160 silver badges176 bronze badges

4 Comments

Emmanuel Sunil Over a year ago

yes, but list comprehension uses loops! My array is too large!

user2357112 Over a year ago

In NumPy, "without using a loop" includes list comprehensions and other constructs that perform iteration in Python. The goal is to get all the iteration into C loops over unboxed data.

Emmanuel Sunil Over a year ago

Is there a way to do this by repeating the array in a smart way so that in the end all you need to do is subtract 1 array from another?

Kenan Banks Over a year ago

Well, I changed it to a generator comprehension which ought to handle enormous lists. You can iterate over that and pull results one at a time. This is definitely how I'd write it, but I'm no numpy expert.

Julien · Accepted Answer · 2017-03-24 20:35:57Z

0

Check out itertools.combinations:

from itertools import combinations

l = [4, 3, 2, 1]

result = []
for n1, n2 in combinations(l, 2):
    result.append(n1 - n2)

print result

Results in:

[1, 2, 3, 1, 2, 1]

combinations returns a generator, so this is good for very large lists :)

answered Mar 24, 2017 at 20:35

Julien

5,8295 gold badges36 silver badges36 bronze badges

Collectives™ on Stack Overflow

Iterative subtraction of elements in array in Python

4 Answers 4

2 Comments

2 Comments

4 Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

2 Comments

2 Comments

4 Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related