Divide numpy array Python

Question

I have this piece of code:

n = np.load(matrix)["arr_0"]
shape = n.shape
##colsums and rowsums
rows = {}
cols = {}
for i in xrange(shape[0]): #row
    rows[i] = np.sum(n[i,:])
for j in xrange(shape[1]): #cols
    cols[j] = np.sum(n[:,j])
##looping over bins
for i in xrange(shape[0]): #row
    print i
    for j in xrange(shape[1]): #column
        if rows[i] == 0 or cols[j] == 0:
            continue
        n[i,j] = n[i,j]/math.sqrt(rows[i]*cols[j])

It loops over a NumPy matrix of shape (50000, 50000). I need to divide each value by the square root of the product of the sum of its row and the sum of its column. My implementation takes ages. Do you have any suggestions to improve its performance?

There is usually no need for explicit loops when using np.array. If you are using them, you are probably over-complicating something.
– DeepSpace, Commented Jul 4, 2016 at 12:51

Benjamin · Accepted Answer · 2016-07-04 12:57:42Z

2

You can simply take the sums along each axis, take their outer product, take the square root, and then divide. This can be condensed a bit, but it gives you an idea of how to vectorize it.

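import numpy

# Sums of rows and columns
a = numpy.sum(data, axis=1)
b = numpy.sum(data, axis=0)

# Outer product of the row sums and column sums
c = numpy.outer(a, b)

# The square root...
d = numpy.sqrt(c)

# ...and the division (in place, so data needs a float dtype;
# for an integer array use data = data / d instead)
data /= d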
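As a quick sanity check, the steps above can be wrapped in a function and compared against an explicit double loop on a small matrix. This is only a sketch: the names `normalize_outer`, `normalize_loop`, and `example` are made up for illustration, and the example matrix is assumed to have no zero row or column sums.

```python
import math
import numpy as np

def normalize_outer(data):
    """Divide each entry by sqrt(row_sum * col_sum) using an outer product."""
    data = np.asarray(data, dtype=float)   # work in floating point
    row_sums = data.sum(axis=1)
    col_sums = data.sum(axis=0)
    return data / np.sqrt(np.outer(row_sums, col_sums))

def normalize_loop(data):
    """Reference version with explicit loops, in the spirit of the question."""
    data = np.asarray(data, dtype=float)
    out = data.copy()
    for i in range(data.shape[0]):
        for j in range(data.shape[1]):
            out[i, j] = data[i, j] / math.sqrt(data[i, :].sum() * data[:, j].sum())
    return out

# Hypothetical 3x3 input with no zero row or column sums
example = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 10]])
result = normalize_outer(example)
```

Note that `np.outer` builds a full matrix of the same shape as the input, so for a (50000, 50000) array the temporary is as large as the data itself.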

edited Jul 4, 2016 at 12:57

answered Jul 4, 2016 at 12:49

Benjamin



3 Comments

user2979409 Over a year ago

TypeError: ufunc 'divide' output (typecode 'd') could not be coerced to provided output parameter (typecode 'h') according to the casting rule ''same_kind''

Benjamin Over a year ago

There is a type difference between data and d. You can just do data = data/d instead, or find and fix the type difference.

user2979409 Over a year ago

I think it throws the error because I have rows and columns whose sum is 0, so the division is invalid.
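For what it's worth, the typecodes in the error message are 'h' (int16) and 'd' (float64), which points to the dtype mismatch rather than the zero sums: an in-place division cannot store a float result in an integer array. A zero row or column sum would normally give a runtime warning and inf/nan values rather than a TypeError. A small sketch of the difference, using a made-up int16 array named `counts`:

```python
import numpy as np

counts = np.array([[1, 2], [3, 4]], dtype=np.int16)             # typecode 'h'
d = np.sqrt(np.outer(counts.sum(axis=1), counts.sum(axis=0)))   # float64, typecode 'd'

try:
    counts /= d            # in place: the float result cannot be written into int16
    in_place_failed = False
except TypeError:
    in_place_failed = True

normalized = counts / d    # out of place: the result is a new float64 array
```

Converting once with `counts.astype(float)` before dividing in place should work as well, at the cost of a float copy of the data.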

Divakar · Accepted Answer · 2016-07-04 12:56:27Z

1

Here's a one-liner using np.where and NumPy broadcasting, assuming rows and cols are NumPy arrays of the row and column sums (rather than the dicts in the question):

np.where((rows[:,None]==0) | (cols==0),n,n/np.sqrt((rows[:,None]*cols)))
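A self-contained sketch of the same idea on a small array, assuming `rows` and `cols` are computed with `sum` along each axis. The example values are invented, and the `np.errstate` block is an addition: `np.where` evaluates both branches, so the division still runs for the zero-sum entries and would otherwise emit warnings.

```python
import numpy as np

# Hypothetical input whose first row and first column sum to zero
n = np.array([[0., 0., 0.],
              [0., 2., 4.],
              [0., 6., 8.]])
rows = n.sum(axis=1)   # shape (3,)
cols = n.sum(axis=0)   # shape (3,)

# rows[:, None] has shape (3, 1), so the product broadcasts to (3, 3)
denom = np.sqrt(rows[:, None] * cols)
zero_mask = (rows[:, None] == 0) | (cols == 0)

# Entries with a zero row or column sum are left unchanged;
# the warnings from 0/0 in the unused branch are silenced
with np.errstate(divide="ignore", invalid="ignore"):
    result = np.where(zero_mask, n, n / denom)
```

This matches the `continue` in the original loop, which skips any entry whose row or column sum is zero.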

answered Jul 4, 2016 at 12:56

Divakar

