3

Given a matrix A, I want to apply a different random shuffle to each row of A; for example,

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

becomes

array([[1, 3, 2],
       [6, 5, 4],
       [7, 9, 8]])

Of course, we can loop over the matrix and shuffle each row individually; however, iteration is slow, so I am asking whether there is a more efficient way to do this.

2
  • Another answer here. The comments there also suggest apply_along_axis. Another answer for columns is here and here and here. Commented Jun 10, 2019 at 23:11
  • And one more here for columns as well. Commented Jun 10, 2019 at 23:15

2 Answers

5

Picked up this neat trick from Divakar which involves randn and argsort:

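import numpy as np

np.random.seed(0)  # fix the seed so the output below is reproducible

s = np.arange(16).reshape(4, 4)
# argsort of random values gives one random permutation of column indices per row
np.take_along_axis(s, np.random.randn(*s.shape).argsort(axis=1), axis=1)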

array([[ 1,  0,  3,  2],
       [ 4,  6,  5,  7],
       [11, 10,  8,  9],
       [14, 12, 13, 15]])

For a 2D array, this can be simplified to

s[np.arange(len(s))[:,None], np.random.randn(*s.shape).argsort(axis=1)]

array([[ 1,  0,  3,  2],
       [ 4,  6,  5,  7],
       [11, 10,  8,  9],
       [14, 12, 13, 15]])
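
The trick works because argsort applied to a row of independent continuous random draws returns a random permutation of that row's column indices (ties are vanishingly unlikely). Below is a minimal sketch that wraps the idea in a helper and checks that every row keeps the same elements. The name shuffle_rows and the use of np.random.default_rng are assumptions made for illustration, not part of the original answer.

```python
import numpy as np

def shuffle_rows(a, rng=None):
    """Return a copy of 2-D array `a` with each row shuffled independently."""
    rng = np.random.default_rng() if rng is None else rng
    # One random key per element; argsort along each row turns the keys
    # into a random permutation of that row's column indices.
    idx = rng.random(a.shape).argsort(axis=1)
    return np.take_along_axis(a, idx, axis=1)

s = np.arange(16).reshape(4, 4)
out = shuffle_rows(s, np.random.default_rng(0))

# Sorting each shuffled row recovers the original row, so only the order changed.
print(np.array_equal(np.sort(out, axis=1), s))
```

Passing an explicit generator keeps the result reproducible without touching NumPy's global random state.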

You can also apply np.random.permutation over each row independently to return a new array.

np.apply_along_axis(np.random.permutation, axis=1, arr=s)

array([[ 3,  1,  0,  2],
       [ 4,  6,  5,  7],
       [ 8,  9, 10, 11],
       [15, 14, 13, 12]])

Performance -

s = np.arange(10000 * 100).reshape(10000, 100) 

%timeit s[np.arange(len(s))[:,None], np.random.randn(*s.shape).argsort(axis=1)] 
%timeit np.apply_along_axis(np.random.permutation, 1, s)   

84.6 ms ± 857 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
842 ms ± 8.06 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

I've noticed that the relative performance depends on the dimensions of your data, so make sure to test it on your own arrays first.
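
Another option worth benchmarking is Generator.permuted, which shuffles each slice along an axis independently. This is a sketch that assumes NumPy 1.20 or later; it was not part of the timings above, so no speed claim is made for it.

```python
import numpy as np

rng = np.random.default_rng(0)
s = np.arange(10000 * 100).reshape(10000, 100)

# permuted shuffles every row independently along axis=1 and returns a copy;
# the input array is left unchanged unless `out=` is given.
out = rng.permuted(s, axis=1)

# Each row still holds the same values as before.
print(np.array_equal(np.sort(out, axis=1), s))
```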

2 Comments

Thanks! So if I got a 3D array and if I want to permute the last dimension, then I can do np.take_along_axis(s, np.random.randn(*s.shape).argsort(axis=2), axis=2), right?
@Tony Yes, I think that should work.
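
A small sketch that checks the 3D case from the comment above. The array shape and the seeded generator are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
s = np.arange(2 * 3 * 4).reshape(2, 3, 4)

# Random keys with the same shape as s; argsort along the last axis gives an
# independent permutation of indices for every (i, j) slice.
idx = rng.standard_normal(s.shape).argsort(axis=2)
out = np.take_along_axis(s, idx, axis=2)

# Sorting along the last axis recovers the original array,
# so each slice was only reordered.
print(np.array_equal(np.sort(out, axis=2), s))
```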
0

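Code-wise, you can use NumPy's apply_along_axis:

np.apply_along_axis(np.random.shuffle, 1, matrix)

Note that np.random.shuffle shuffles in place and returns None, so the return value of this call is not the shuffled matrix. It also doesn't seem to be more efficient than iterating, at least for a 3x3 matrix. For that method I get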

> %%timeit 
> np.apply_along_axis(np.random.shuffle, 1, test)
67 µs ± 1.8 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

while the iteration gives

> %%timeit
> for i in range(test.shape[0]):
>     np.random.shuffle(test[i])
20.3 µs ± 284 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

1 Comment

apply_along_axis essentially just iterates over the 'other' axes. No speed promises. It makes iteration prettier for 3D and larger arrays; it does nothing for 2D.
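
To illustrate that point, here is a rough sketch comparing apply_along_axis with an explicit loop on a 2D array. The array and the seeded generator are assumptions for the example; the two calls consume different random draws, so they produce different shuffles, but they do the same kind of per-row work.

```python
import numpy as np

rng = np.random.default_rng(0)
a = np.arange(12).reshape(3, 4)

# For a 2-D array and axis=1, the function is called once per row.
via_apply = np.apply_along_axis(rng.permutation, 1, a)

# Roughly the same work written as an explicit Python loop over rows.
via_loop = np.stack([rng.permutation(row) for row in a])

print(via_apply.shape, via_loop.shape)
```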
