Random matrix with sum of values by column = 1 in python

Question

I want to create a matrix like a transition matrix. How can I create a random matrix in Python whose values sum to 1 in each column?

Does this answer your question? Random Binary Matrix where Rows and Columns Sum to 1 using Numpy
– Hamid Rasti, Commented Aug 6, 2022 at 11:53

Asbjørn Olav Orvedal · Accepted Answer · 2022-08-06 12:20:23Z

3

(EDIT: added output)

I suggest completing this in two steps:

1. Create a random matrix
2. Normalize each column

1. Create random matrix

Let's say you want a 3 by 3 random transition matrix:

M = np.random.rand(3, 3)

Each of M's entries will be a random value drawn uniformly from [0, 1).

2. Normalize M's columns

Dividing each column by its column sum will achieve what you want. This can be done in several ways, but I prefer to create an array r whose elements are the column sums of M:

r = M.sum(axis=0)

Then, divide M by r:

transition_matrix = M / r
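The division works column by column because of NumPy broadcasting: r has one entry per column, so it lines up with the last axis of M. A minimal sketch on a non-square matrix makes this visible and also shows the row-wise variant, which needs keepdims=True. The names rng, col_stochastic and row_stochastic, the 2 by 3 shape and the seed are illustrative assumptions, not part of the answer above.

```python
import numpy as np

rng = np.random.default_rng(0)   # seed chosen only for reproducibility
M = rng.random((2, 3))           # 2 rows, 3 columns, entries in [0, 1)

# Column normalization: (2, 3) / (3,) broadcasts the sums across rows.
col_sums = M.sum(axis=0)         # shape (3,), one sum per column
col_stochastic = M / col_sums

# Row normalization: keepdims=True keeps shape (2, 1) so that
# (2, 3) / (2, 1) broadcasts the sums across columns instead.
row_sums = M.sum(axis=1, keepdims=True)
row_stochastic = M / row_sums

print(col_stochastic.sum(axis=0))  # each column sums to 1 (up to rounding)
print(row_stochastic.sum(axis=1))  # each row sums to 1 (up to rounding)
```

Without keepdims=True the row sums would have shape (2,), which does not broadcast against a (2, 3) matrix, so the row-wise division would raise an error.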

Example output

>>> import numpy as np

>>> M = np.random.rand(3,3 )
>>> r = M.sum(axis=0)
>>> transition_matrix = M / r

>>> M
array([[0.74145687, 0.68389986, 0.37008102],
       [0.81869654, 0.0394523 , 0.94880781],
       [0.93057194, 0.48279246, 0.15581823]])
>>> r
array([2.49072535, 1.20614462, 1.47470706])
>>> transition_matrix
array([[0.29768713, 0.56701315, 0.25095223],
       [0.32869804, 0.03270943, 0.64338731],
       [0.37361483, 0.40027743, 0.10566046]])
>>> transition_matrix.sum(axis=0)
array([1., 1., 1.])
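The printed sums above are rounded for display, so a tolerance-based check is a more reliable way to confirm the result. The following is a sketch only; is_column_stochastic and its tol parameter are hypothetical names introduced here, not NumPy functions.

```python
import numpy as np

def is_column_stochastic(matrix, tol=1e-12):
    """Hypothetical helper: True if all entries are >= 0 and every column sums to 1."""
    matrix = np.asarray(matrix, dtype=float)
    nonnegative = np.all(matrix >= 0)
    sums_to_one = np.allclose(matrix.sum(axis=0), 1.0, rtol=0.0, atol=tol)
    return bool(nonnegative and sums_to_one)

M = np.random.rand(3, 3)
transition_matrix = M / M.sum(axis=0)

print(is_column_stochastic(transition_matrix))  # normalized matrix
print(is_column_stochastic(M))                  # raw matrix, before normalizing
```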

edited Aug 6, 2022 at 12:20

answered Aug 6, 2022 at 12:09

Asbjørn Olav Orvedal


3 Comments

Severin Pappadeux Over a year ago

You recognize that distribution of those numbers would be ... what?

Asbjørn Olav Orvedal Over a year ago

As described in the docs, numpy.random.rand uses a uniform distribution over [0, 1).

Severin Pappadeux Over a year ago

And what would the distribution be for values like X1/(X1+X2) (the simplest case, a 2x2 matrix), where X1 and X2 are both U(0,1)? I added an update to my answer to discuss the problem.

Severin Pappadeux · Accepted Answer · 2022-08-08 15:03:38Z

You could use a known distribution whose samples sum to one by construction, e.g. the Dirichlet distribution.

After that, the code is basically a one-liner (Python 3.8, Windows 10 x64):

import numpy as np

N = 3

# alphas (concentration parameters), all 1s here
a = np.ones(N)

mtx = np.random.dirichlet(a, N).transpose()

print(mtx)

and it will print something like

[[0.56634637 0.04568052 0.79105779]
 [0.42542107 0.81892862 0.02465906]
 [0.00823256 0.13539087 0.18428315]]
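The same idea can be sketched for non-square matrices with NumPy's Generator API. The function name random_transition_matrix and its parameters are assumptions made for illustration. A single alpha value is used for every entry: alpha = 1 gives the flat Dirichlet used above, values below 1 tend to put most of a column's mass on a few entries, and values above 1 pull entries toward 1/n_rows.

```python
import numpy as np

def random_transition_matrix(n_rows, n_cols, alpha=1.0, seed=None):
    """Hypothetical helper: each column is an independent Dirichlet draw."""
    rng = np.random.default_rng(seed)
    alphas = np.full(n_rows, alpha, dtype=float)
    # dirichlet returns shape (n_cols, n_rows) with every row summing to 1,
    # so transposing yields a matrix whose columns sum to 1.
    return rng.dirichlet(alphas, size=n_cols).T

mtx = random_transition_matrix(4, 3, seed=0)
print(mtx.shape)        # (4, 3)
print(mtx.sum(axis=0))  # each column sums to 1 (up to rounding)
```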

UPDATE

For the "sample something and normalize" approach, the problem is that the resulting values follow an unknown distribution. For the Dirichlet distribution there are known expressions for the mean, std. dev., PDF, CDF, you name it.

Even when the X_i are sampled from U(0,1), what is the distribution of X_i/Sum(i, X_i)?

Can anything be said about its mean? Std. dev.? PDF? Other statistical properties?

You could also sample from an exponential distribution and normalize the sum to 1, but the question would be even more acute: if X_i is Exp(1), what is the distribution of X_i/Sum(i, X_i)? PDF? Mean? Std. dev.?
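One way to probe these questions is empirically. The sketch below compares the first component of normalized U(0,1) draws, normalized Exp(1) draws, and direct Dirichlet(1, ..., 1) draws. Two facts that are not stated in the answer are worth noting: by symmetry the mean of each component is 1/N in all three cases, and normalizing independent Exp(1) variables is a standard construction of the flat Dirichlet distribution, so those two variances should agree (1/18 for N = 3). The normalized uniform case is a different distribution. The sample size, the seed and the helper normalize are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(12345)  # seed chosen only for reproducibility
n, samples = 3, 200_000

def normalize(x):
    """Divide each row by its sum so that every row sums to 1."""
    return x / x.sum(axis=1, keepdims=True)

from_uniform = normalize(rng.random((samples, n)))
from_exponential = normalize(rng.exponential(1.0, size=(samples, n)))
from_dirichlet = rng.dirichlet(np.ones(n), size=samples)

for name, draws in [("uniform", from_uniform),
                    ("exponential", from_exponential),
                    ("dirichlet", from_dirichlet)]:
    first = draws[:, 0]  # first component of every sample
    print(f"{name:12s} mean={first.mean():.4f} var={first.var():.4f}")
```

Rows are normalized here only because each sample is stored as a row; transposing gives the column convention used in the rest of this page.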
