
I am trying to implement K-means with selective centroid selection. I have two NumPy arrays: "features", in which each row is a data point, and "labels", in which the entry at index "i" is the class label of the data point at index "i". The data points belong to 4 different classes. I want to use both arrays to randomly pick one data point from each class. Could you please help me with this? Also, is there a way to zip two NumPy arrays into a dictionary?

For example, my features array is:

[[1,1,1],[1,2,3],[1,6,7],[1,4,6],[1,6,9],[1,4,2]] and my labels array is [1,2,2,3,1,3]

For each unique value in the labels array, I want one randomly chosen corresponding element of the features array. A sample answer would be:

[1,1,1] from class 1
[1,6,7] from class 2
[1,4,2] from class 3
  • Please provide an example of your problem and the result you want to achieve. Commented Apr 26, 2019 at 20:22
  • I updated my question with an example. Please check. Commented Apr 26, 2019 at 20:46

3 Answers


Given this is the setup in your question:

import numpy as np
features = [[1,1,1],[1,2,3],[1,6,7],[1,4,6],[1,6,9],[1,4,2]]
labels = np.array([1,2,2,3,1,3])

This should give you one random data point per label, as a dictionary:

# Positions 0..len(features)-1, used to look up rows by label
features_index = np.arange(len(features))
unique_labels = np.unique(labels)
rand = []
for n in unique_labels:
    # Pick one random position among the points labelled n
    rand.append(features[np.random.choice(features_index[labels == n])])
dict(zip(unique_labels, rand))
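
If you need this more than once, the same idea can be wrapped in a small function. This is only a sketch: the name `pick_one_per_class` and the optional `rng` argument are my own choices, not anything from the question, and it assumes `labels` is one-dimensional with one entry per row of `features`. Passing a seeded generator makes the picks reproducible.

```python
import numpy as np

def pick_one_per_class(features, labels, rng=None):
    """Return {label: one randomly chosen row of `features` with that label}.

    Hypothetical helper; assumes `labels` is 1-D and aligned with `features`.
    """
    rng = np.random.default_rng() if rng is None else rng
    features = np.asarray(features)
    labels = np.asarray(labels)
    picks = {}
    for label in np.unique(labels):
        # Positions of all points that carry this label
        members = np.flatnonzero(labels == label)
        # .item() turns the NumPy scalar into a plain Python key
        picks[label.item()] = features[rng.choice(members)]
    return picks

features = [[1, 1, 1], [1, 2, 3], [1, 6, 7], [1, 4, 6], [1, 6, 9], [1, 4, 2]]
labels = [1, 2, 2, 3, 1, 3]

# Seeded generator so repeated runs give the same picks
centroids = pick_one_per_class(features, labels, rng=np.random.default_rng(0))
```

Each value in `centroids` is a row drawn only from the points of its own class, so it can be used directly as the initial centroid for that class.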

Try:

import numpy as np

features = np.array([[1,1,1],[1,2,3],[1,6,7],[1,4,6],[1,6,9],[1,4,2]])
labels = np.array([1,2,2,3,1,3])

res = {i: features[np.random.choice(np.where(labels == i)[0])] for i in set(labels)}

Output (one possible result, since the choice is random):

{1: array([1, 1, 1]), 2: array([1, 2, 3]), 3: array([1, 4, 2])}
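
On the second part of the question (zipping two arrays into a dictionary): `dict(zip(...))` works, but dictionary keys are unique, so repeated labels overwrite each other. A rough sketch of the difference, using the example data from the question; the names `last_per_label` and `grouped` are just illustrative:

```python
import numpy as np
from collections import defaultdict

features = np.array([[1, 1, 1], [1, 2, 3], [1, 6, 7], [1, 4, 6], [1, 6, 9], [1, 4, 2]])
labels = np.array([1, 2, 2, 3, 1, 3])

# Plain zip: later points replace earlier ones with the same label,
# so only the last point of each class survives.
last_per_label = dict(zip(labels.tolist(), features.tolist()))

# Grouping: collect every point under its label instead.
grouped = defaultdict(list)
for label, point in zip(labels.tolist(), features.tolist()):
    grouped[label].append(point)
grouped = dict(grouped)
```

Here `last_per_label` holds one point per class (always the last one, not a random one), while `grouped` keeps all points of each class, which you could then sample from.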


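You can accomplish this with a bit of indexing and numpy.unique. Shuffle the indices first, then take the first occurrence of each label in the shuffled order, so that every class contributes exactly one point (this assumes features and labels are both NumPy arrays):


perm = np.random.permutation(labels.shape[0])
u, first = np.unique(labels[perm], return_index=True)

idx = perm[first]

dict(zip(u, features[idx]))

One possible result (the selection is random):

{1: array([1, 6, 9]), 2: array([1, 6, 7]), 3: array([1, 4, 6])}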
