I have a numpy array as follows:
array = np.random.randint(6, size=(50, 400))
Each entry is the cluster that the value belongs to, with each row representing a sample and each column representing a feature. I would like to create an array with 5 columns (one per cluster) holding the frequency of each cluster within each sample, where a sample is a row of this matrix.
However, in the frequency calculation I want to ignore 0, meaning that the frequencies of the nonzero values (1-5) should sum to 1 in each row.
Essentially, what I want is an array in which each column is a cluster (1-5 in this case) and each row still corresponds to a single sample.
How can this be done?
Edit:
small input:
input = np.random.randint(6, size=(2, 5))
array([[0, 4, 2, 3, 0],
       [5, 5, 2, 5, 3]])
output:
1    2    3    4    5
0    .33  .33  .33  0
0    .2   .2   0    .6
Where 1-5 are the column names, and the two rows below them are the desired output as a numpy array.
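For reference, here is a minimal sketch of the computation I am describing, written with plain NumPy broadcasting. The function name `cluster_frequencies` and its `n_clusters` parameter are hypothetical names chosen for illustration. It also assumes that a row containing only zeros should come out as all zeros, which the question itself does not specify.

```python
import numpy as np

def cluster_frequencies(labels, n_clusters=5):
    """Row-wise frequency of clusters 1..n_clusters, ignoring label 0."""
    labels = np.asarray(labels)
    clusters = np.arange(1, n_clusters + 1)
    # counts[i, k] = number of entries in row i equal to cluster k + 1
    counts = (labels[:, :, None] == clusters).sum(axis=1)
    # Number of nonzero labels per row (label 0 never matches a cluster)
    totals = counts.sum(axis=1, keepdims=True)
    # Assumption: rows with no nonzero labels are left as all zeros
    return np.divide(
        counts,
        totals,
        out=np.zeros(counts.shape, dtype=float),
        where=totals > 0,
    )

small = np.array([[0, 4, 2, 3, 0],
                  [5, 5, 2, 5, 3]])
freq = cluster_frequencies(small)
print(freq.round(2))
```

On the small input this should reproduce the table above, with one row per sample and one column per cluster. The intermediate boolean array has shape (samples, features, clusters), so for the (50, 400) case it stays small. For much larger inputs a per-row `np.bincount` may be a leaner alternative.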