Numpy: Flatten some columns of a 2D array

Question

Suppose I have a NumPy array like the one below:

a = np.asarray([[1,2,3],[1,4,3],[2,5,4],[2,7,5]])

array([[1, 2, 3],
       [1, 4, 3],
       [2, 5, 4],
       [2, 7, 5]])

How can I flatten columns 2 and 3 for each unique element in column 1, as shown below?

array([[1, 2, 3, 4, 3],
       [2, 5, 4, 7, 5]])

Thank you for your help.

What happens if the resulting rows don't have the same length?
– Julien, Commented Jul 8, 2016 at 1:48
Each unique element in column 1 has a fixed number of rows, so the resulting rows will have the same length.
– Allen Qin, Commented Jul 8, 2016 at 1:55
I doubt numpy will have a built-in function for such a specific case. You can probably use pandas though. Or just write your own function. Have you tried anything?
– Julien, Commented Jul 8, 2016 at 1:57

akuiper · Accepted Answer · 2016-07-08 02:04:25Z

2

Another option, using a list comprehension:

np.array([np.insert(a[a[:,0] == k, 1:].flatten(), 0, k) for k in np.unique(a[:,0])])

# array([[1, 2, 3, 4, 3],
#        [2, 5, 4, 7, 5]])
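
For readability, the one-liner above can be unpacked into explicit steps. This is only a sketch of the same idea; the names rows, group, flat and result are mine, not part of the original answer.

```python
import numpy as np

a = np.asarray([[1, 2, 3], [1, 4, 3], [2, 5, 4], [2, 7, 5]])

rows = []
for k in np.unique(a[:, 0]):            # sorted unique keys from column 0
    group = a[a[:, 0] == k, 1:]         # rows for this key, without column 0
    flat = group.flatten()              # row-major flattening of the group
    rows.append(np.insert(flat, 0, k))  # put the key back at the front

result = np.array(rows)
print(result)
```

Note that the boolean mask a[:, 0] == k is recomputed for every key, so this does one full pass over column 0 per unique value.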

answered Jul 8, 2016 at 2:04

akuiper


John1024 · Accepted Answer · 2016-07-08 02:03:18Z

2

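import numpy as np

a = np.asarray([[1,2,3],[1,4,3],[2,5,4],[2,7,5]])

# Map each key in column 0 to the concatenation of its remaining columns.
d = {}
for row in a:
    # Starting from the empty list [] upcasts the result to float.
    d[row[0]] = np.concatenate((d.get(row[0], []), row[1:]))

# Prepend each key to its flattened values.
r = np.array([np.concatenate(([key], d[key])) for key in d])
print(r)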

This prints:

[[ 1.  2.  3.  4.  3.]
 [ 2.  5.  4.  7.  5.]]
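
The output above is floating point because np.concatenate with an empty list [] promotes the result to float. If integer output is wanted, one possible variant (a sketch, not the original answer's code; groups and r_int are names I made up) collects plain Python ints in lists and converts once at the end:

```python
import numpy as np
from collections import defaultdict

a = np.asarray([[1, 2, 3], [1, 4, 3], [2, 5, 4], [2, 7, 5]])

# Collect the non-key columns of every row under its key, as plain ints.
groups = defaultdict(list)
for row in a:
    groups[int(row[0])].extend(row[1:].tolist())

# Dicts preserve insertion order (Python 3.7+), so keys appear in the
# order they are first seen in column 0.
r_int = np.array([[key] + values for key, values in groups.items()])
print(r_int)
```

Like the original, this relies on every key having the same number of rows; otherwise the inner lists have different lengths and cannot form a regular 2-D array.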

answered Jul 8, 2016 at 2:03

John1024


Divakar · Accepted Answer · 2016-07-08 05:22:05Z

As stated in the comments, each unique element in column 0 has a fixed number of rows, which I take to mean the same number of rows for every element. That lets us use a vectorized approach. First, sort the rows by column 0. Then look for the first change in value along that column: its position marks the first group boundary and so gives the number of rows per unique element. Let's call it L. Finally, slice the sorted array to select columns 1 and 2, and group every L rows together by reshaping. The implementation would be -

sa = a[a[:,0].argsort()]
L = np.unique(sa[:,0],return_index=True)[1][1]
out = np.column_stack((sa[::L,0],sa[:,1:].reshape(-1,2*L)))

For a further performance boost, we can compute L with np.diff instead, like so -

L = np.where(np.diff(sa[:,0])>0)[0][0]+1
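
As a quick sanity check, here is a small sketch showing that both ways of computing L agree on a six-row example with three keys. The names L_unique and L_diff are only for illustration.

```python
import numpy as np

a = np.asarray([[1, 2, 3], [3, 7, 8], [1, 4, 3],
                [2, 5, 4], [3, 8, 2], [2, 7, 5]])
sa = a[a[:, 0].argsort()]  # column 0 becomes [1, 1, 2, 2, 3, 3]

# Index at which the second unique key first appears.
L_unique = np.unique(sa[:, 0], return_index=True)[1][1]

# Position of the first increase in the sorted key column, plus one.
L_diff = np.where(np.diff(sa[:, 0]) > 0)[0][0] + 1

print(L_unique, L_diff)
```

Both expressions index into the second group, so each raises an IndexError when column 0 holds only a single unique value.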

Sample run -

In [103]: a
Out[103]: 
array([[1, 2, 3],
       [3, 7, 8],
       [1, 4, 3],
       [2, 5, 4],
       [3, 8, 2],
       [2, 7, 5]])

In [104]: sa = a[a[:,0].argsort()]
     ...: L = np.unique(sa[:,0],return_index=True)[1][1]
     ...: out = np.column_stack((sa[::L,0],sa[:,1:].reshape(-1,2*L)))
     ...: 

In [105]: out
Out[105]: 
array([[1, 2, 3, 4, 3],
       [2, 5, 4, 7, 5],
       [3, 7, 8, 8, 2]])
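
The same idea can be wrapped in a small function. This is a sketch under my own assumptions rather than the answer's exact code: flatten_groups is a hypothetical name, it uses a stable sort (argsort's default kind is not guaranteed to keep rows with equal keys in their original order), it checks the equal-group-size assumption explicitly, and it does not hard-code the number of value columns.

```python
import numpy as np

def flatten_groups(a):
    """Group rows by column 0 and flatten the remaining columns per group.

    Hypothetical helper: assumes a non-empty 2-D array in which every key
    in column 0 occurs the same number of times.
    """
    a = np.asarray(a)
    # A stable sort keeps rows with equal keys in their original order.
    sa = a[a[:, 0].argsort(kind="stable")]
    keys, counts = np.unique(sa[:, 0], return_counts=True)
    if not np.all(counts == counts[0]):
        raise ValueError("every key must appear the same number of times")
    # One output row per key: all its non-key values in row-major order.
    values = sa[:, 1:].reshape(len(keys), -1)
    return np.column_stack((keys, values))

a = np.asarray([[1, 2, 3], [3, 7, 8], [1, 4, 3],
                [2, 5, 4], [3, 8, 2], [2, 7, 5]])
out = flatten_groups(a)
print(out)
```

Raising a ValueError for unequal group sizes is a design choice of this sketch; it addresses the case raised in the comments where the resulting rows would not have the same length.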
