I have a NumPy array x of integers with shape (n, 4), like:

[[0 1 2 3],
[1 2 7 9],
[2 1 5 2],
...]

I want to transform the array into an array of pairs:

[0,1]
[0,2]
[0,3]
[1,2]
...

so that the first element of each row is paired with each of the other elements in the same row. I already have a for-loop solution:

y=np.array([[x[j,0],x[j,i]] for i in range(1,4) for j in range(0,n)],dtype=int)

but since looping over a NumPy array is inefficient, I tried slicing instead. I can do the slicing for each column like this:

y[1]=np.array([x[:,0],x[:,1]]).T
# [[0,1],[1,2],[2,1],...] 

I can repeat this for all columns. My questions are:

  1. How can I append y[2] to y[1],... such that the shape is (N,2)?
  2. If the number of columns is not small (here it is 4), how can I find y[i] elegantly?
  3. What are the alternative ways to achieve the final array?
  • itertools.combinations() may be useful here. Commented Apr 12, 2015 at 2:43
  • @BlacklightShining: Not really. The pattern needed isn't actually combinations, though it might look that way at first due to poor choice of example data, and for NumPy, you want to avoid itertools about as much as you want to avoid loops and comprehensions. Commented Apr 12, 2015 at 3:20
  • itertools is rarely the best answer to a numpy question... Commented Apr 12, 2015 at 3:20

3 Answers

The cleanest way of doing this I can think of would be:

>>> x = np.arange(12).reshape(3, 4)
>>> x
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
>>> n = x.shape[1] - 1
>>> y = np.repeat(x, (n,)+(1,)*n, axis=1)
>>> y
array([[ 0,  0,  0,  1,  2,  3],
       [ 4,  4,  4,  5,  6,  7],
       [ 8,  8,  8,  9, 10, 11]])
>>> y.reshape(-1, 2, n).transpose(0, 2, 1).reshape(-1, 2)
array([[ 0,  1],
       [ 0,  2],
       [ 0,  3],
       [ 4,  5],
       [ 4,  6],
       [ 4,  7],
       [ 8,  9],
       [ 8, 10],
       [ 8, 11]])

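This makes two copies of the data (one in np.repeat and one in the final reshape of the transposed array), so it is not the most efficient method. A more efficient version, which allocates the output once and fills it in place, would probably be something like: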

>>> y = np.empty((x.shape[0], n, 2), dtype=x.dtype)
>>> y[..., 0] = x[:, 0, None]
>>> y[..., 1] = x[:, 1:]
>>> y.shape = (-1, 2)
>>> y
array([[ 0,  1],
       [ 0,  2],
       [ 0,  3],
       [ 4,  5],
       [ 4,  6],
       [ 4,  7],
       [ 8,  9],
       [ 8, 10],
       [ 8, 11]])
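
The preallocation approach above can be wrapped in a small helper for reuse. This is only a minimal sketch: the name pair_first_with_rest is hypothetical (it is not a NumPy function), and it assumes a 2-D input with at least two columns. It is run here on the array from the question.

```python
import numpy as np

def pair_first_with_rest(x):
    """Pair column 0 of each row with every other column of that row.

    Hypothetical helper, not part of NumPy; assumes x is 2-D.
    """
    x = np.asarray(x)
    rows, cols = x.shape
    out = np.empty((rows, cols - 1, 2), dtype=x.dtype)
    out[..., 0] = x[:, :1]   # first column, broadcast across the cols-1 slots
    out[..., 1] = x[:, 1:]   # remaining columns fill the second slot
    return out.reshape(-1, 2)

x = np.array([[0, 1, 2, 3],
              [1, 2, 7, 9],
              [2, 1, 5, 2]])
pairs = pair_first_with_rest(x)
```

Because the output is allocated once and filled by two broadcast assignments, no intermediate repeated array is built, and the final reshape of a contiguous array does not copy.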

Like Jaimie, I first tried a repeat of the 1st column followed by reshaping, but then decided it was simpler to make 2 intermediary arrays, and hstack them:

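import numpy as np

x=np.array([[0,1,2,3],[1,2,7,9],[2,1,5,2]])
m,n=x.shape
x1=x[:,0].repeat(n-1)[:,None]   # first column, each value repeated n-1 times
x2=x[:,1:].reshape(-1,1)        # remaining columns, flattened row by row
np.hstack([x1,x2])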

producing

array([[0, 1],
       [0, 2],
       [0, 3],
       [1, 2],
       [1, 7],
       [1, 9],
       [2, 1],
       [2, 5],
       [2, 2]])

There are probably other ways of doing this sort of rearrangement. The result will copy the original data one way or another. My guess is that as long as you are using compiled functions like reshape and repeat, the time differences won't be significant.
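
Rather than guessing, the variants can be checked against each other and timed. The following is a rough sketch, not a rigorous benchmark: the array size, the seed and the function names via_repeat and via_hstack are arbitrary choices made for illustration, and the timings will vary by machine and NumPy version.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
x = rng.integers(0, 100, size=(10_000, 4))   # arbitrary test data
n = x.shape[1] - 1                           # number of pairs per row

def via_repeat(x):
    # repeat the first column n times, then regroup into pairs
    y = np.repeat(x, (n,) + (1,) * n, axis=1)
    return y.reshape(-1, 2, n).transpose(0, 2, 1).reshape(-1, 2)

def via_hstack(x):
    # build the two output columns separately and join them
    x1 = x[:, 0].repeat(n)[:, None]
    x2 = x[:, 1:].reshape(-1, 1)
    return np.hstack([x1, x2])

# both variants should produce identical arrays
same = np.array_equal(via_repeat(x), via_hstack(x))

for f in (via_repeat, via_hstack):
    print(f.__name__, timeit.timeit(lambda: f(x), number=100))
```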

Suppose the numpy array is

arr = np.array([[0, 1, 2, 3],
                [1, 2, 7, 9],
                [2, 1, 5, 2]])

You can get the array of pairs as

import itertools
m, n = arr.shape
# pair the first element of each row with every other element of that row
new_arr = np.array([x for i in range(m)
                    for x in itertools.product(arr[i, 0 : 1], arr[i, 1 : n])])

The output would be

array([[0, 1],
       [0, 2],
       [0, 3],
       [1, 2],
       [1, 7],
       [1, 9],
       [2, 1],
       [2, 5],
       [2, 2]])
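
For comparison, the per-column slicing described in the question can also be combined without itertools or a Python-level loop over rows. This is a sketch under the assumption that one small list of column slices is acceptable; the names blocks, by_column and by_row are made up for illustration.

```python
import numpy as np

arr = np.array([[0, 1, 2, 3],
                [1, 2, 7, 9],
                [2, 1, 5, 2]])

# One (m, 2) block per non-first column: columns 0 and i side by side.
blocks = [arr[:, [0, i]] for i in range(1, arr.shape[1])]

# Stacking the blocks vertically gives column-by-column order,
# the same order the question's for-loop produces.
by_column = np.concatenate(blocks, axis=0)

# Stacking along a new middle axis and flattening gives row-by-row order,
# matching the output shown above.
by_row = np.stack(blocks, axis=1).reshape(-1, 2)
```

The list comprehension here runs once per column rather than once per element, so the cost of the Python loop stays small when the number of rows is large.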
