0

I want to randomly choose from an array, with the requirement that the distinct values in the output form consecutive integers starting at zero (no gaps). For example, if I want to draw 5 numbers from 0 to 5 inclusive, one could do

np.random.choice(np.arange(6), 5)
array([5, 0, 5, 2, 5])

where, in this case, I would like this to be:

array([2, 0, 2, 1, 2])

Another example, if

np.random.choice(np.arange(6), 5)
array([1, 1, 1, 4, 2])

I am trying to "rebase" this in such a manner that it will be

array([0, 0, 0, 2, 1])

Final example: select 15 numbers from 0 to 5 inclusive

np.random.choice(np.arange(6), 15)
array([4, 5, 3, 0, 4, 5, 3, 0, 2, 5, 2, 3, 2, 4, 4])

where eventually I want to end up with

array([3, 4, 2, 0, 3, 4, 2, 0, 1, 4, 1, 2, 1, 3, 3])
2
  • Why are you generating random values between 0 and 5 only to rebase it later? Why not just generate between 0 and 2, or 0 and 4? Commented Feb 24, 2018 at 13:14
  • @SeanBreckenridge How would you do that and ensure there are no gaps? Commented Feb 24, 2018 at 13:22

3 Answers

4

What you're looking to do is replace each entry in your original array x by its index in the array of unique elements of x (in sorted order). For example, if x is np.array([7, 6, 2, 7, 7, 2]), the unique elements of x are [2, 6, 7], and we want to replace each number with its position in that array of unique elements: that is, replace each 2 with 0, each 6 with 1 and each 7 with 2.

The numpy.unique function does both these jobs: it finds the (sorted) array of unique elements for you, and if you pass return_inverse=True, np.unique will also give you a second return value that contains exactly the indices you're after. So all you need to do is call np.unique with return_inverse=True, throw away the first return value, and keep the second. Examples:

>>> import numpy as np
>>> np.unique([5, 0, 5, 2, 5], return_inverse=True)[1]
array([2, 0, 2, 1, 2])
>>> x = np.array([4, 5, 3, 0, 4, 5, 3, 0, 2, 5, 2, 3, 2, 4, 4])
>>> np.unique(x, return_inverse=True)[1]
array([3, 4, 2, 0, 3, 4, 2, 0, 1, 4, 1, 2, 1, 3, 3])

1 Comment

Brilliant! Many thanks Mark, that was new to me. Your comments are very helpful. 6502's answer below shows a similar approach, yours is more self-contained. Thanks again
2

You could start from a randomly chosen array

x = np.random.choice(np.arange(6), 5)

then collect the unique values and sort them

v = sorted(set(x))

then map the original value to the index in v:

result = [v.index(y) for y in x]
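
Note that `v.index(y)` scans the list on every lookup. If the number of distinct values is large, one possible variation (a sketch, not part of the original answer) is to build a dictionary from value to position once and look values up in it. The name `rank` is hypothetical.

```python
import numpy as np

x = np.random.choice(np.arange(6), 5)

# Sorted distinct values, as in the steps above
v = sorted(set(x.tolist()))

# Hypothetical helper mapping: value -> its position among the sorted distinct values
rank = {value: i for i, value in enumerate(v)}

# Same result as [v.index(y) for y in x], with a constant-time lookup per element
result = [rank[y] for y in x.tolist()]
print(result)
```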


0

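If the sequence you are choosing from contains only unique elements, sorting-based approaches like np.unique are a bit wasteful at O(n log n), since an O(n) solution is available (assuming n >= k, where n is the number of draws and k is the size of the set to choose from; in general the cost is O(n + k)). The idea is to draw indices into the sequence rather than the elements themselves, and then relabel those indices: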

>>> import numpy as np
>>> to_choose_from = [1, 5, 7, 9, 10, 'hello', ()]
>>> n = 12
>>> 
>>> k = len(to_choose_from)
>>> # make sure no duplicates - skip this if you happen to know
>>> assert len(set(to_choose_from)) == k
>>> 
>>> chc = np.random.randint(0, k, (n,))
>>> chc
array([4, 4, 1, 5, 3, 1, 5, 5, 6, 1, 6, 6])
>>> 
>>> occur = np.zeros((k,), int)
>>> occur[chc] = 1
>>> idx, = np.where(occur)
>>> occur[idx] = np.arange(idx.size)
>>> result = occur[chc]
>>> result
array([2, 2, 0, 3, 1, 0, 3, 3, 4, 0, 4, 4])
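
The same idea can be wrapped in a small function. The sketch below is a variation rather than the exact code above: it uses a cumulative sum over a boolean "present" mask in place of the `np.where` step, and it assumes the input holds integers in `range(k)`. The function name `rebase` is hypothetical.

```python
import numpy as np

def rebase(chc, k):
    """Relabel integers drawn from range(k) as 0, 1, 2, ... without sorting."""
    chc = np.asarray(chc)
    # Mark which of the k possible values actually occur
    present = np.zeros(k, dtype=bool)
    present[chc] = True
    # The running count of present values, minus one, is each value's new label.
    # Entries for absent values are meaningless but are never looked up.
    ranks = np.cumsum(present) - 1
    return ranks[chc]

chc = np.array([4, 4, 1, 5, 3, 1, 5, 5, 6, 1, 6, 6])
print(rebase(chc, 7))  # [2 2 0 3 1 0 3 3 4 0 4 4]
```

Both the mask and the cumulative sum take O(k) time and the two indexing steps take O(n), so no sort is involved.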

