Numpy: Subtract 2 numpy arrays row wise

Question

I have 2 numpy arrays a and b as below:

a = np.random.randint(0,10,(3,2))
Out[124]: 
array([[0, 2],
       [6, 8],
       [0, 4]])
b = np.random.randint(0,10,(2,2))
Out[125]: 
array([[5, 9],
       [2, 4]])

I want to subtract each row in b from each row in a and the desired output is of shape(3,2,2):

array([[[-5, -7],
        [-2, -2]],

       [[ 1, -1],
        [ 4,  4]],

       [[-5, -5],
        [-2,  0]]])

I can do this using:

print(np.c_[(a - b[0]),(a - b[1])].reshape(3,2,2))

But I need a fully vectorized solution or a built-in NumPy function to do this.

What I mean by a fully vectorized solution ("factorized" was a typo earlier) is that I don't want to reference array b by index, like b[i], because the number of rows in this array can change. I want a solution that always outputs an array of shape (3, len(b), 2).
– Allen Qin, Commented Apr 14, 2017 at 13:20

DSM · Accepted Answer · 2017-04-14 13:22:41Z

5

Just use np.newaxis (which is just an alias for None) to add a singleton dimension to a, and let broadcasting do the rest:

In [45]: a[:, np.newaxis] - b
Out[45]: 
array([[[-5, -7],
        [-2, -2]],

       [[ 1, -1],
        [ 4,  4]],

       [[-5, -5],
        [-2,  0]]])
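To unpack what the one-liner does, here is a minimal sketch using the arrays from the question. Indexing with np.newaxis turns a from shape (3, 2) into (3, 1, 2); broadcasting against b of shape (2, 2) then stretches the singleton axis to len(b), so b is never indexed row by row. The 4-row array b4 is a made-up example, added only to show that the row count of b can vary.

```python
import numpy as np

a = np.array([[0, 2], [6, 8], [0, 4]])   # shape (3, 2), values from the question
b = np.array([[5, 9], [2, 4]])           # shape (2, 2), values from the question

a3 = a[:, np.newaxis]                    # shape (3, 1, 2): singleton axis in the middle
result = a3 - b                          # (3, 1, 2) against (2, 2) broadcasts to (3, 2, 2)

# result[i, j] holds a[i] - b[j]
print(a3.shape, result.shape)            # (3, 1, 2) (3, 2, 2)

# Hypothetical b with 4 rows: the same expression gives shape (3, 4, 2)
b4 = np.arange(8).reshape(4, 2)
result4 = a[:, np.newaxis] - b4
print(result4.shape)                     # (3, 4, 2)
```

Writing a[:, None] - b is equivalent, since np.newaxis is an alias for None.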

answered Apr 14, 2017 at 13:22

DSM


3 Comments

litepresence Over a year ago

This is about 40% faster than my best solution. Could you better explain what is happening here? It's a bit abstract.

jyn Over a year ago

This is very memory-inefficient for large arrays; I want to subtract a 5000 x 3072 array from a 500 x 3072 array, and this would take 500 * 3072 * 5000 * 8 / 1e9 = 61.44 gigabytes.

Muhammad Yasirroni Over a year ago

This really is a neat answer. We can even subtract a vector from a list of scalars, resulting in a matrix: a = np.array([1,2,3,4,5,6]), b = np.array([1,2,3]), a[:, np.newaxis] - b. The result has shape (6, 3).
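Both points raised in the comments above can be checked with a short sketch. It assumes NumPy 1.20 or later (for np.broadcast_shapes), 8-byte elements such as float64, and the 3072-column width that appears in the comment's arithmetic. The size of the broadcast result is worked out without allocating anything.

```python
import math

import numpy as np

# Shapes from the comment above; 8 bytes per element (e.g. float64) is assumed.
rows_a, rows_b, cols = 5000, 500, 3072

# Shape of a[:, np.newaxis] - b, computed without creating the arrays.
full_shape = np.broadcast_shapes((rows_a, 1, cols), (rows_b, cols))
gigabytes = math.prod(full_shape) * 8 / 1e9
print(full_shape, gigabytes)         # (5000, 500, 3072) 61.44

# The 1-D case from the last comment: every element of a minus every element of b.
a = np.array([1, 2, 3, 4, 5, 6])
b = np.array([1, 2, 3])
diff = a[:, np.newaxis] - b          # (6, 1) against (3,) broadcasts to (6, 3)
print(diff.shape)                    # (6, 3)
```

If the full intermediate does not fit in memory, one option is to process a few rows of a at a time and reduce each block before moving on.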

Andrei Boyanov · Answer · 2017-04-14 13:14:19Z

1

I'm not sure what a fully factorized solution means, but maybe this will help:

np.append(a, a, axis=1).reshape(3, 2, 2) - b

answered Apr 14, 2017 at 13:14

Andrei Boyanov


1 Comment

Allen Qin Over a year ago

Thanks for the answer but please see my comments in the question.
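One way the np.append idea could be made independent of the number of rows in b is sketched below. This is an assumption about how the snippet might be generalized, not something the answer's author proposed; the function name subtract_rows_tiled and the 4-row array b4 are made up for illustration.

```python
import numpy as np

a = np.array([[0, 2], [6, 8], [0, 4]])
b = np.array([[5, 9], [2, 4]])


def subtract_rows_tiled(a, b):
    """Hypothetical generalization: lay out len(b) copies of each row of a, then subtract b."""
    tiled = np.tile(a, (1, len(b)))                 # shape (len(a), len(b) * n_cols)
    return tiled.reshape(len(a), len(b), -1) - b    # shape (len(a), len(b), n_cols)


out = subtract_rows_tiled(a, b)
print(out.shape)                                    # (3, 2, 2)

# Made-up b with 4 rows to show the row count can change
b4 = np.arange(8).reshape(4, 2)
out4 = subtract_rows_tiled(a, b4)
print(out4.shape)                                   # (3, 4, 2)
```

Note that np.tile materializes the repeated copies of a, whereas broadcasting with a[:, np.newaxis] produces the same values without that extra copy.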

litepresence · Answer · answered 2017-04-14 13:09, edited by Community 2017-05-23 10:30:55Z

1

You can shave a little time off using np.subtract(), and a good bit more using np.concatenate():

import time

import numpy as np

N = 100000  # iterations per variant

# Variant 1: np.c_ with the - operator (the approach from the question)
start = time.time()
for i in range(N):
    a = np.random.randint(0, 10, (3, 2))
    b = np.random.randint(0, 10, (2, 2))
    c = np.c_[(a - b[0]), (a - b[1])].reshape(3, 2, 2)
print(time.time() - start)

# Variant 2: np.c_ with np.subtract()
start = time.time()
for i in range(N):
    a = np.random.randint(0, 10, (3, 2))
    b = np.random.randint(0, 10, (2, 2))
    c = np.c_[np.subtract(a, b[0]), np.subtract(a, b[1])].reshape(3, 2, 2)
print(time.time() - start)

# Variant 3: np.concatenate() with np.subtract()
start = time.time()
for i in range(N):
    a = np.random.randint(0, 10, (3, 2))
    b = np.random.randint(0, 10, (2, 2))
    c = np.concatenate([np.subtract(a, b[0]), np.subtract(a, b[1])], axis=1).reshape(3, 2, 2)
print(time.time() - start)

>>>

3.14023900032
3.00368094444
1.16146492958
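The loops above also time the random array generation on every iteration. A hedged alternative is to time only the subtraction on fixed inputs with the standard-library timeit module. The function names below are made up for this sketch, and the broadcasting variant is included only for comparison; absolute timings depend on the machine, so none are shown.

```python
import timeit

import numpy as np

a = np.array([[0, 2], [6, 8], [0, 4]])
b = np.array([[5, 9], [2, 4]])


# Illustrative function names, not taken from the answer.
def with_c_():
    return np.c_[a - b[0], a - b[1]].reshape(3, 2, 2)


def with_concatenate():
    return np.concatenate([a - b[0], a - b[1]], axis=1).reshape(3, 2, 2)


def with_broadcasting():
    return a[:, np.newaxis] - b


candidates = (with_c_, with_concatenate, with_broadcasting)
results = [f() for f in candidates]   # kept so the outputs can be compared

for f in candidates:
    seconds = timeit.timeit(f, number=10_000)
    print(f"{f.__name__}: {seconds:.4f} s for 10,000 calls")
```

All three functions return the same (3, 2, 2) array, so the comparison is like for like.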

Reference: "confused about numpy.c_ document and sample code"

np.c_ is another way of doing array concatenation.

edited May 23, 2017 at 10:30

CommunityBot


answered Apr 14, 2017 at 13:09

litepresence


2 Comments

Allen Qin Over a year ago

Thanks for the answer but please see my comments in the question.

litepresence Over a year ago

Ah, I see, that adds a twist. I wasn't sure what you meant by "factorized" in the original post; I will ponder and see if I can conjure something. Nonetheless, reduced CPU load is always a plus.

Mohamed · Answer · 2017-04-14 14:21:20Z

1

The documentation on broadcasting says:

When operating on two arrays, NumPy compares their shapes element-wise. It starts with the trailing dimensions, and works its way forward. Two dimensions are compatible when they are equal, or one of them is 1.

Back to your case: you want the result to have shape (3, 2, 2). Following these rules, you have to add singleton dimensions so that the shapes line up. Here's the code to do it:

In [1]: a_ = np.expand_dims(a, axis=0)

In [2]: b_ = np.expand_dims(b, axis=1)

In [3]: c = a_ - b_

In [4]: c
Out[4]: 
array([[[-5, -7],
        [ 1, -1],
        [-5, -5]],

       [[-2, -2],
        [ 4,  4],
        [-2,  0]]])

In [5]: result = c.swapaxes(1, 0)

In [6]: result
Out[6]: 
array([[[-5, -7],
        [-2, -2]],

       [[ 1, -1],
        [ 4,  4]],

       [[-5, -5],
        [-2,  0]]])

In [7]: result.shape
Out[7]: (3, 2, 2)
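As a cross-check of the steps above, here is a minimal sketch with the arrays from the question. The comments trace the compatibility rule dimension by dimension, and the last lines suggest that placing the singleton axis of a at position 1 gives the target layout without the swap. The variable name direct is made up for this example.

```python
import numpy as np

a = np.array([[0, 2], [6, 8], [0, 4]])
b = np.array([[5, 9], [2, 4]])

a_ = np.expand_dims(a, axis=0)   # shape (1, 3, 2)
b_ = np.expand_dims(b, axis=1)   # shape (2, 1, 2)

# Compared from the trailing dimension: 2 vs 2 (equal), 3 vs 1 (one is 1),
# 1 vs 2 (one is 1), so the shapes broadcast to (2, 3, 2).
c = a_ - b_
result = c.swapaxes(1, 0)        # shape (3, 2, 2)

# Expanding a on axis 1 instead: (3, 1, 2) against (2, 2) gives (3, 2, 2) directly.
direct = np.expand_dims(a, axis=1) - b
print(np.array_equal(result, direct))   # True
```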

edited Apr 14, 2017 at 14:21

answered Apr 14, 2017 at 13:37

Mohamed


1 Comment

scribe Over a year ago

Does this answer still work if, say, A and B had more than two columns?
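On the question in the comment above: the broadcasting rule only requires the trailing (column) dimensions to be equal, so the column count should not matter as long as both arrays have the same number of columns. A small sketch with made-up 5-column arrays, including the failure case when the column counts differ:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.integers(0, 10, size=(3, 5))   # 3 rows, 5 columns (made-up data)
b = rng.integers(0, 10, size=(4, 5))   # 4 rows, 5 columns (made-up data)

result = (np.expand_dims(a, axis=0) - np.expand_dims(b, axis=1)).swapaxes(1, 0)
print(result.shape)                    # (3, 4, 5)

# Spot check: entry [i, j] is a[i] - b[j]
spot_ok = np.array_equal(result[2, 1], a[2] - b[1])

# With different column counts (5 vs 4) the shapes cannot be broadcast.
mismatch_raises = False
try:
    np.expand_dims(a, axis=0) - np.expand_dims(b[:, :4], axis=1)
except ValueError as exc:
    mismatch_raises = True
    print("mismatched columns:", exc)
```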
