How to normalize a 2-dimensional numpy array in python less verbose?

Question

Given a 3×3 numpy array

import numpy

a = numpy.arange(0,27,3).reshape(3,3)

# array([[ 0,  3,  6],
#        [ 9, 12, 15],
#        [18, 21, 24]])

To normalize the rows of the 2-dimensional array I thought of

row_sums = a.sum(axis=1) # array([ 9, 36, 63])
new_matrix = numpy.zeros((3,3))
for i, (row, row_sum) in enumerate(zip(a, row_sums)):
    new_matrix[i,:] = row / row_sum

There must be a better way, right?

To clarify: by normalizing I mean that the entries of each row must sum to one. But I think that will be clear to most people.

Careful, "normalize" usually means the square sum of components is one. Your definition will hardly be clear to most people ;) – coldfix, Commented Jul 13, 2015 at 18:10
@coldfix is talking about the L2 norm and considers it the most common one (which may be true), while Aufwind uses the L1 norm, which is also a norm. – Bálint Sass, Commented Feb 12, 2021 at 9:50

Daniel Fischer · Accepted Answer · 2012-01-18 04:27:17Z

181

Broadcasting is really good for this:

row_sums = a.sum(axis=1)
new_matrix = a / row_sums[:, numpy.newaxis]

row_sums[:, numpy.newaxis] reshapes row_sums from being (3,) to being (3, 1). When you do a / b, a and b are broadcast against each other.

You can learn more about broadcasting here or even better here.
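
The shape change described above can be checked directly. This is a minimal sketch using the array from the question; the names `col_vector` and `new_matrix` are only illustrative.

```python
import numpy as np

a = np.arange(0, 27, 3).reshape(3, 3)

row_sums = a.sum(axis=1)              # shape (3,)
col_vector = row_sums[:, np.newaxis]  # shape (3, 1)

# (3, 3) / (3, 1): the single column is stretched across all three columns,
# so every element is divided by the sum of its own row.
new_matrix = a / col_vector

print(row_sums.shape, col_vector.shape)
print(new_matrix.sum(axis=1))
```

Printing the row sums of `new_matrix` is a quick way to confirm that each row now adds up to one.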

edited Jan 18, 2012 at 4:27

Daniel Fischer

answered Jan 18, 2012 at 3:21

Bi Rico

10 Comments

ali_m Over a year ago

This can be simplified even further using a.sum(axis=1, keepdims=True) to keep the singleton column dimension, which you can then broadcast along without having to use np.newaxis.

asdf Over a year ago

what if any of the row_sums is zero?

coldfix Over a year ago

This is the correct answer for the question as stated above - but if a normalization in the usual sense is desired, use np.linalg.norm instead of a.sum!

Paul Over a year ago

is this preferred to row_sums.reshape(3,1) ?

nos Over a year ago

It's not as robust since the row sum may be 0.
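
The comments above raise two points: `keepdims=True` avoids the explicit `np.newaxis`, and a row sum of zero breaks the division. One possible way to handle both is sketched below. The helper name `normalize_rows` and the choice to leave zero-sum rows as all zeros are assumptions of this sketch, not something the answer prescribes.

```python
import numpy as np

def normalize_rows(a):
    """Scale each row to sum to 1; rows whose sum is 0 come back as zeros."""
    a = np.asarray(a, dtype=float)
    row_sums = a.sum(axis=1, keepdims=True)   # shape (n, 1), no newaxis needed
    out = np.zeros_like(a)
    # Divide only where the row sum is non-zero; elsewhere `out` keeps its 0.
    np.divide(a, row_sums, out=out, where=row_sums != 0)
    return out

m = np.array([[0, 3, 6],
              [0, 0, 0]])
result = normalize_rows(m)
print(result)
```

Whether zero rows should become zeros, NaN, or a uniform distribution depends on the application, so treat the fallback here as a placeholder.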

normanius · Accepted Answer · 2020-11-21 20:58:48Z

137

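Scikit-learn offers a function normalize() that lets you apply various normalizations. Making each row sum to 1 corresponds to the L1 norm (for non-negative entries, since L1 normalization divides by the sum of absolute values). Therefore: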

from sklearn.preprocessing import normalize

matrix = numpy.arange(0,27,3).reshape(3,3).astype(numpy.float64)
# array([[  0.,   3.,   6.],
#        [  9.,  12.,  15.],
#        [ 18.,  21.,  24.]])

normed_matrix = normalize(matrix, axis=1, norm='l1')
# [[ 0.          0.33333333  0.66666667]
#  [ 0.25        0.33333333  0.41666667]
#  [ 0.28571429  0.33333333  0.38095238]]

Now your rows will sum to 1.
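
Dividing by the plain row sum and dividing by the L1 norm only agree when a row has no negative entries. The sketch below uses NumPy alone to show the difference; the array `x` is made up for illustration and the code does not call scikit-learn.

```python
import numpy as np

x = np.array([[1.0, -1.0, 2.0],
              [3.0,  1.0, 0.0]])

# Divide by the plain sum: the entries of each row sum to 1.
by_sum = x / x.sum(axis=1, keepdims=True)

# Divide by the L1 norm: the absolute values of each row sum to 1.
by_l1 = x / np.abs(x).sum(axis=1, keepdims=True)

print(by_sum)
print(by_l1)
```

The first row differs between the two results because it contains a negative entry; the second row is identical in both.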

edited Nov 21, 2020 at 20:58

normanius

answered Mar 20, 2014 at 22:54

rogueleaderr

1 Comment

JEM_Mosig Over a year ago

This also has the advantage that it works on sparse arrays that would not fit into memory as dense arrays.

tom10 · Accepted Answer · 2012-01-18 03:22:12Z

11

I think this should work,

a = numpy.arange(0,27.,3).reshape(3,3)

a /=  a.sum(axis=1)[:,numpy.newaxis]

answered Jan 18, 2012 at 3:22

tom10

1 Comment

wim Over a year ago

good. note the change of dtype to arange, by appending decimal point to 27.
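
The dtype point matters because the division here is in-place. A small sketch of the difference follows; the variable names are illustrative, and the `TypeError` behaviour is what recent NumPy versions do (very old versions may behave differently).

```python
import numpy as np

a_int = np.arange(0, 27, 3).reshape(3, 3)      # integer dtype
a_float = np.arange(0, 27., 3).reshape(3, 3)   # float64 because of "27."

try:
    # In-place true division would have to store floats in an integer array.
    a_int /= a_int.sum(axis=1)[:, np.newaxis]
    int_inplace_ok = True
except TypeError:
    int_inplace_ok = False

# With a float array the in-place division works as intended.
a_float /= a_float.sum(axis=1)[:, np.newaxis]

print(int_inplace_ok)
print(a_float.sum(axis=1))
```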

walt · Accepted Answer · 2014-05-10 20:33:37Z

6

In case you are trying to normalize each row so that its magnitude is one (i.e. each row has unit length, so the sum of the squares of its elements is one):

import numpy as np

a = np.arange(0,27,3).reshape(3,3)

result = a / np.linalg.norm(a, axis=-1)[:, np.newaxis]
# array([[ 0.        ,  0.4472136 ,  0.89442719],
#        [ 0.42426407,  0.56568542,  0.70710678],
#        [ 0.49153915,  0.57346234,  0.65538554]])

Verifying:

np.sum( result**2, axis=-1 )
# array([ 1.,  1.,  1.])

edited May 10, 2014 at 20:33

answered May 10, 2014 at 19:13

walt

2 Comments

Ztyx Over a year ago

Axis doesn't seem to be a parameter to np.linalg.norm (anymore?).

dpb Over a year ago

notably this corresponds to the l2 norm (where as rows summing to 1 corresponds to the l1 norm)

Snoopy · Accepted Answer · 2018-10-16 04:45:06Z

5

I think you can normalize each row so that its elements sum to 1 with new_matrix = a / a.sum(axis=1, keepdims=1). Column normalization can be done with new_matrix = a / a.sum(axis=0, keepdims=1). Hope this helps.

answered Oct 16, 2018 at 4:45

Snoopy

Saurabh Gupta · Accepted Answer · 2019-10-31 05:00:04Z

2

You could use built-in numpy function: np.linalg.norm(a, axis = 1, keepdims = True)

answered Oct 31, 2019 at 5:00

Saurabh Gupta

1 Comment

qwr Over a year ago

This computes the norm and does not normalize the matrix
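
As the comment notes, the call above only returns the norms. To get a normalized matrix, the array still has to be divided by them. A minimal sketch, assuming L2 normalization of the rows is what is wanted:

```python
import numpy as np

a = np.arange(0, 27, 3).reshape(3, 3)

# L2 norm of each row; keepdims=True keeps the shape (3, 1) for broadcasting.
norms = np.linalg.norm(a, axis=1, keepdims=True)

# Dividing by the norms gives rows of unit length.
unit_rows = a / norms

print(np.linalg.norm(unit_rows, axis=1))
```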

Jamesszm · Accepted Answer · 2015-11-08 15:13:37Z

1

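It appears that this also works:

def normalizeRows(M):
    # keepdims=True keeps row_sums as shape (n, 1), so the division
    # broadcasts across each row of a plain ndarray
    row_sums = M.sum(axis=1, keepdims=True)
    return M / row_sums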
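
With a plain ndarray, the shape of the row sums decides whether a function like this divides along the right axis. The sketch below uses a non-square array, chosen only for illustration, to make the difference visible.

```python
import numpy as np

M = np.arange(1, 7, dtype=float).reshape(2, 3)   # 2 rows, 3 columns

try:
    # (2, 3) / (2,): the trailing axes have lengths 3 and 2, so this fails.
    M / M.sum(axis=1)
    plain_division_ok = True
except ValueError:
    plain_division_ok = False

# (2, 3) / (2, 1): each row is divided by its own sum.
normalized = M / M.sum(axis=1, keepdims=True)

print(plain_division_ok)
print(normalized.sum(axis=1))
```

For a square array the 1-D division does not raise an error, but it divides each column by a row sum, which is easy to miss.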

answered Nov 8, 2015 at 15:13

Jamesszm

Maciek · Accepted Answer · 2017-02-21 11:20:31Z

0

You could also use matrix transposition:

(a.T / row_sums).T

answered Feb 21, 2017 at 11:20

Maciek

2 Comments

qwr Over a year ago

this answer is incomplete without how you computed row_sums

Maciek Over a year ago

It is in the original question: row_sums = a.sum(axis=1)
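
To see why the transposition works: after `a.T`, the original row index is the last axis, which is the axis a 1-D `row_sums` broadcasts along. A short sketch with a non-square array (chosen for illustration) compares it with the `np.newaxis` form:

```python
import numpy as np

a = np.arange(1, 7, dtype=float).reshape(2, 3)
row_sums = a.sum(axis=1)                   # shape (2,)

# a.T has shape (3, 2); its last axis lines up with row_sums.
via_transpose = (a.T / row_sums).T

# Equivalent form that reshapes row_sums instead of transposing a.
via_newaxis = a / row_sums[:, np.newaxis]

print(np.allclose(via_transpose, via_newaxis))
```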

Grayrigel · Accepted Answer · 2020-11-07 21:36:11Z

0

Here is one more possible way using reshape:

a_norm = (a/a.sum(axis=1).reshape(-1,1)).round(3)
print(a_norm)

Or using None works too:

a_norm = (a/a.sum(axis=1)[:,None]).round(3)
print(a_norm)

Output:

array([[0.   , 0.333, 0.667],
       [0.25 , 0.333, 0.417],
       [0.286, 0.333, 0.381]])

answered Nov 7, 2020 at 21:36

Grayrigel

Moj · Accepted Answer · 2023-01-19 23:01:52Z

0

Use

a = a / np.linalg.norm(a, ord = 2, axis = 0, keepdims = True)

Due to broadcasting, it will work as intended. Note that axis = 0 scales each column to unit L2 norm; use axis = 1 to normalize the rows instead.

answered Jan 19, 2023 at 23:01

Moj

XY.W · Accepted Answer · 2017-01-12 09:31:39Z

-1

Or use a lambda function, like:

>>> import numpy as np
>>> vec = np.arange(0,27,3).reshape(3,3)
>>> # map() is lazy in Python 3, so collect the rows back into an array
>>> norm_vec = np.array(list(map(lambda row: row/np.linalg.norm(row), vec)))

Each row of norm_vec will have unit norm.

answered Jan 12, 2017 at 9:31

XY.W

1 Comment

qwr Over a year ago

is this using python's map? won't builtin numpy functions be much faster?

kimegitee · Accepted Answer · 2021-10-13 17:41:27Z

-1

We can achieve the same effect by premultiplying with the diagonal matrix whose main diagonal holds the reciprocals of the row sums.

A = np.diag(1 / A.sum(axis=1)) @ A

answered Oct 13, 2021 at 17:41

kimegitee

2 Comments

qwr Over a year ago

too inefficient. you turned a simple sum over all elements into a big (sparse) matrix multiplication

kimegitee Over a year ago

@qwr The original poster did not ask for a more efficient version, only a less "verbose" one.
