1

I wonder if there are very simple ways to calculate pairwise subtraction for two elements in a multi-dimensional array which is consisted of vectors USING a function in NUMPY or SCIPY library.

Let me introduce an example:

>>> a = (np.arange(9)).reshape((3,3)) # a list of 3 vectors (x, y, z)

>>> a
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

I want to get the following:

>>>> result
array([[3,3,3],
       [6,6,6],
       [3,3,3]])
# first element comes from [3,4,5] - [0,1,2]
# second element comes from [6,7,8] - [0,1,2]
# third element comes from [6,7,8] - [3,4,5]

I do not matter about the signs (+/-) on the result which depends on the order of subtraction of two vectors. But, I want to know VERY SIMPLE version of code using pre-defined functions in Scipy or Numpy libraries such as scipy.spatial.distance.pdist.

I do need loop codes to iterate elements-wise for result, instead, I need just one line to get the result.

2 Answers 2

3

Approach #1

Get the pairwise indices with np.triu_indices, index into rows of a with those and compute the differences -

In [8]: r,c = np.triu_indices(len(a),1)

In [9]: a[c] - a[r]
Out[9]: 
array([[3, 3, 3],
       [6, 6, 6],
       [3, 3, 3]])

Approach #2

We can also use slicing that avoids creating the indices and the indexing part itself that creates copies of the input array for the slices needed for subtraction. Thus, we would work with views only, but we need to iterate in that process. The advantage of slicing shows off for large arrays on performance, as we would verify later on in timings. The implementation would be -

n = len(a)
N = n*(n-1)//2
idx = np.concatenate(( [0], np.arange(n-1,0,-1).cumsum() ))
start, stop = idx[:-1], idx[1:]
out = np.empty((N,a.shape[1]),dtype=a.dtype)
for j,i in enumerate(range(n-1)):
    out[start[j]:stop[j]] = a[i+1:] - a[i,None]

Runtime test

Approaches as funcs -

def pairwise_row_diff_triu_indices(a):
    r,c = np.triu_indices(len(a),1)
    out = a[c] - a[r]
    return out

def pairwise_row_diff_slicing(a):
    n = len(a)
    N = n*(n-1)//2
    idx = np.concatenate(( [0], np.arange(n-1,0,-1).cumsum() ))
    start, stop = idx[:-1], idx[1:]
    out = np.empty((N,a.shape[1]),dtype=a.dtype)
    for j,i in enumerate(range(n-1)):
        out[start[j]:stop[j]] = a[i+1:] - a[i,None]
    return out

Timings -

In [53]: np.random.seed(0)

In [54]: a = np.random.randint(0,9,(1000,3))

In [55]: %timeit pairwise_row_diff_triu_indices(a)
    ...: %timeit pairwise_row_diff_slicing(a)
10 loops, best of 3: 21 ms per loop
100 loops, best of 3: 6.01 ms per loop

In [56]: a = np.random.randint(0,9,(5000,3))

In [57]: %timeit pairwise_row_diff_triu_indices(a)
    ...: %timeit pairwise_row_diff_slicing(a)
1 loop, best of 3: 456 ms per loop
10 loops, best of 3: 110 ms per loop
Sign up to request clarification or add additional context in comments.

1 Comment

I was surprised and I am glad to see such a creative approach. That's what answer I wanted
0

This is not using numpy or scipy functions...but it is a simple solution

length = a.shape[1]
new_arr = []
for ii in range(length):
    for jj in range(ii+1,length):
        new_arr.append(a[ii,]-a[jj,])

1 Comment

Thanks for your solution. Yes, it is simple, but I wanted to think in different ways.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.