
I have a numpy array, and I would like to shuffle parts of it. For example, with the following array:

import numpy as np
import random

a = np.arange(15)
# => array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])

I want to do:

shuffle_parts(a, [(0, 3), (10, 13)])
# => array([ 2,  0,  1,  3,  4,  5,  6,  7,  8,  9, 12, 11, 10, 13, 14])
#            ^^^^^^^^^                              ^^^^^^^^^^
#            Shuffle those 3 values                 and those 3 values

The following would shuffle the whole array, which is not what I want:

random.shuffle(a) 
# => array([10, 11,  8,  1, 13,  5,  9, 14,  4,  7,  2, 12,  3,  0,  6])

One way would be to use split / concatenate like so:

splits = np.split(a, 5)
random.shuffle(splits[0])
random.shuffle(splits[3])
np.concatenate(splits)
# => array([ 2,  0,  1,  3,  4,  5,  6,  7,  8, 11, 10, 9, 12, 13, 14])
#            ^^^^^^^^^                          ^^^^^^^^^^
#            Correctly shuffled                 Shuffled but off by 1 index

This is almost what I want. My questions:

  • Can I write shuffle_parts so that it accepts custom indices (parts at arbitrary positions, not restricted to equal-sized splits, and with varying lengths)?
  • Is there a method in numpy that I missed and that would help me do that?

Comment (Mar 12, 2019 at 13:35): Just shuffle sliced views of the array, for instance np.random.shuffle(a[0:3]).

2 Answers


It can be done directly:

>>> import numpy as np
>>> import random
>>> a = np.arange(15)
>>> s=3
>>> f=7
>>> random.shuffle(a[s:f])
>>> a
array([ 0,  1,  2,  5,  4,  3,  6,  7,  8,  9, 10, 11, 12, 13, 14])

Slicing with a[s:f] returns a view that references the array's data directly, so shuffling the slice reorders those elements of a in place.
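To illustrate why this works, here is a small sketch contrasting a basic slice (a view) with integer-list indexing (a copy). The variable names are only illustrative, and np.random.shuffle is used in place of random.shuffle purely for the demonstration.

```python
import numpy as np

a = np.arange(15)

view = a[3:7]            # basic slice: a view onto a's buffer
copy = a[[3, 4, 5, 6]]   # integer-list ("fancy") indexing: a fresh copy

shares_view = np.shares_memory(a, view)   # True
shares_copy = np.shares_memory(a, copy)   # False

np.random.shuffle(view)  # reorders a[3:7] in place
np.random.shuffle(copy)  # only reorders the temporary copy; a is untouched
```

So the in-place trick applies to slices, but not to parts selected with a list or array of indices.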


4 Comments

My code is running inside workers running in parallel, and I've noticed that all random values end up the same (for a given work batch). If I init the random seed with the worker index, that solves this problem. I don't know if this is what you are warning about.
@BenjaminCrouzier No, the random algorithm itself is "insecure". I do not think it should matter though for programs that just need some randomness in them. I noted it in case you want to read more.
You're confusing randomness and security. Python (and by extension numpy) uses a pseudorandom number generator, so it is not secure for cryptographic purposes. For anything else, it's just fine. This is not a weakness or drawback of the library or language - it's the same in any other language. If you need truly random numbers, they have to come from a truly random source rather than an algorithm (and if you don't know if you do, you don't).
@user2699 Very well, I did not want to go into such depth, but the end of your comment I think drives your point - I'll edit out the note.
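Regarding the parallel-worker point raised in the comments above: one possible way to give each worker its own random stream is NumPy's SeedSequence.spawn, sketched below. The base seed, the worker count, and the work function are assumptions made for illustration, and the workers are run sequentially here just to keep the example self-contained.

```python
import numpy as np

n_workers = 4                          # hypothetical number of workers
root = np.random.SeedSequence(12345)   # hypothetical base seed
child_seeds = root.spawn(n_workers)    # one independent child seed per worker

def work(seed_seq):
    """Hypothetical worker body: shuffle two parts with a private generator."""
    rng = np.random.default_rng(seed_seq)
    a = np.arange(15)
    rng.shuffle(a[0:3])
    rng.shuffle(a[10:13])
    return a

# In real code each child seed would be handed to a separate process.
results = [work(s) for s in child_seeds]
```

Seeding the stdlib random module with the worker index, as described in the comment, addresses the same problem; this is just the NumPy-native variant.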
3

NumPy slices are views on the underlying data, so you can shuffle the slices directly:

import numpy as np
import random

a = np.arange(15)

random.shuffle(a[0:3])
random.shuffle(a[10:13])
print(a)
# [ 2  0  1  3  4  5  6  7  8  9 12 10 11 13 14]

You could then implement your shuffle_parts function using slice like this:

def shuffle_parts(array, slices):
    # each s is a (start, stop) tuple; array[slice(*s)] is a view, so the shuffle is in place
    for s in slices:
        random.shuffle(array[slice(*s)])

shuffle_parts(array=a, slices=((0, 3), (10, 13)))

or (depending on how you want to pass the slices to your function):

def shuffle_parts(array, slices):
    # each s is a slice object; array[s] is a view, so the shuffle is in place
    for s in slices:
        random.shuffle(array[s])

shuffle_parts(array=a, slices=(slice(0, 3), slice(10, 13)))

Personally, I'd prefer the second version: that way you could also, for example, shuffle the even indices with shuffle_parts(array=a, slices=(slice(None, None, 2),)).
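As a sketch that combines both calling conventions, the variant below accepts either (start, stop) tuples or slice objects. It swaps in NumPy's Generator.shuffle instead of random.shuffle, and the optional rng parameter is an assumption added here for reproducibility, not part of the versions above.

```python
import numpy as np

def shuffle_parts(array, slices, rng=None):
    """Shuffle each given part of a 1-D array in place.

    `slices` may contain slice objects or (start, stop[, step]) tuples.
    `rng` is an optional numpy Generator (hypothetical extra parameter).
    """
    rng = np.random.default_rng() if rng is None else rng
    for s in slices:
        if not isinstance(s, slice):
            s = slice(*s)          # turn a tuple into a slice object
        rng.shuffle(array[s])      # array[s] is a view, so `array` is modified
    return array

a = np.arange(15)
shuffle_parts(a, [(0, 3), slice(10, 13)], rng=np.random.default_rng(0))
```

Elements outside the listed parts stay where they are; only positions 0 to 2 and 10 to 12 are permuted.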
