4
np.array([1,2,3])

I've got numpy array. I would like to turn it into a numpy array with tuples of each 1:1 permutation. Like this:

np.array([
    [(1,1),(1,2),(1,3)],
    [(2,1),(2,2),(2,3)],
    [(3,1),(3,2),(3,3)],
])

Any thoughts on how to do this efficiently? I need to do this operation a few million times.

0

5 Answers 5

6

You can do something like this:

>>> a = np.array([1, 2, 3])
>>> n = a.size
>>> np.vstack((np.repeat(a, n), np.tile(a, n))).T.reshape(n, n, 2)
array([[[1, 1],
        [1, 2],
        [1, 3]],

       [[2, 1],
        [2, 2],
        [2, 3]],

       [[3, 1],
        [3, 2],
        [3, 3]]])

Or as suggested by @Jaime you can get around 10x speedup if we take advantage of broadcasting here:

>>> a = np.array([1, 2, 3])
>>> n = a.size                 
>>> perm = np.empty((n, n, 2), dtype=a.dtype)
perm[..., 0] = a[:, None]
perm[..., 1] = a
... 
>>> perm
array([[[1, 1],
        [1, 2],
        [1, 3]],

       [[2, 1],
        [2, 2],
        [2, 3]],

       [[3, 1],
        [3, 2],
        [3, 3]]])

Timing comparisons:

>>> a = np.array([1, 2, 3]*100)
>>> %%timeit                   
np.vstack((np.repeat(a, n), np.tile(a, n))).T.reshape(n, n, 2)
... 
1000 loops, best of 3: 934 µs per loop
>>> %%timeit                   
perm = np.empty((n, n, 2), dtype=a.dtype)                     
perm[..., 0] = a[:, None]
perm[..., 1] = a
... 
10000 loops, best of 3: 111 µs per loop
Sign up to request clarification or add additional context in comments.

1 Comment

You can get a big speed-up (10x for a 100 item array) if you stay away from the tiling and repeating an use broadcasting: perm = np.empty((n, n, 2), dtype=a.dtype); perm[..., 0] = a; perm[..., 1] = a[:. None]
5

If you're working with numpy, don't work with tuples. Use its power and add another dimension of size two. My recommendation is:

x = np.array([1,2,3])
np.vstack(([np.vstack((x, x, x))], [np.vstack((x, x, x)).T])).T

or:

im = np.vstack((x, x, x))
np.vstack(([im], [im.T])).T

And for a general array:

ix = np.vstack([x for _ in range(x.shape[0])])
return np.vstack(([ix], [ix.T])).T

This will produce what you want:

array([[[1, 1],
        [1, 2],
        [1, 3]],

       [[2, 1],
        [2, 2],
        [2, 3]],

       [[3, 1],
        [3, 2],
        [3, 3]]])

But as a 3D matrix, as you can see when looking at its shape:

Out[25]: (3L, 3L, 2L)

This is more efficient than the solution with permutations as the array size get's bigger. Timing my solution against @Kasra's yields 1ms for mine vs. 46ms for the one with permutations for an array of size 100. @AshwiniChaudhary's solution is more efficient though.

Comments

2

Yet another way using numpy.meshgrid.

>>> x = np.array([1, 2, 3])
>>> perms = np.stack(np.meshgrid(x, x))
>>> perms
array([[[1, 2, 3],
        [1, 2, 3],
        [1, 2, 3]],

       [[1, 1, 1],
        [2, 2, 2],
        [3, 3, 3]]])
>>> perms.transpose().reshape(9, 2)
array([[1, 1],
       [1, 2],
       [1, 3],
       [2, 1],
       [2, 2],
       [2, 3],
       [3, 1],
       [3, 2],
       [3, 3]])

2 Comments

I'm not sure what magic the transpose and reshape are doing yet, but this was able to give me a list of permutations to shove back into a predictor model then reshape that to make a pretty 3D graph of my dataset.
Not really magic, just reshaping the results from a 3D array to a 2D. But this is not necessary. The result of meshgrid is arguably closest to what the original question was asking for.
1

You can use itertools.product to get the permutations , then convert the result to numpy array.

>>> from itertools import product
>>> p=list(product(a,repeat=2))
>>> np.array([p[i:i+3] for i in range(0,len(p),3)])
array([[[1, 1],
        [1, 2],
        [1, 3]],

       [[2, 1],
        [2, 2],
        [2, 3]],

       [[3, 1],
        [3, 2],
        [3, 3]]])

1 Comment

I think this method may be the fastest. But why not just do np.array(p).reshape(3, 3, 2) in the second step?
1

I was looking into how to do this better in general, not just for 2-tuples. It can actually be done pretty elegantly using np.indices, which can be used to produce a set of indices to index the original array:

>>> x = np.array([1, 2, 3])
>>> i = np.indices((3, 3)).reshape(2, -1)
>>> a[i].T
array([[1, 1],
       [1, 2],
       [1, 3],
       [2, 1],
       [2, 2],
       [2, 3],
       [3, 1],
       [3, 2],
       [3, 3]])

The general case is done as follows: let n be the number of items in each permutation.

n = 5
x = np.arange(10)

i = np.indices([x.size for _ in range(n)]).reshape(n, -1)
a = x[i].T

Then you can reshape the result to the n-dimensional array form if needed, but often having the permutations is enough. I didn't test the performance of this method, but certainly native numpy calls and indexing ought to be pretty quick. At least this is more elegant than the other solutions in my opinion. And this is pretty similar to the meshgrid solution provided by @Bill.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.