6

Let's suppose I have an array as such:

np.array([[1., 1., 0.],
          [0., 4., 0.],
          [8., 0., 8.],
          [0., 0., 0.],
          [5., 0., 0.],
          [2., 2., 2.]])

With column[0] summing to 16, column[1] to 7 and column[2] to 10.

How do I efficiently re-arrange the columns of the array in NumPy by column sum, greatest to least? In the above example, column[0] would remain in place and column[1] and column[2] would switch positions.

1
  • You can also try np.array(list(zip(*sorted(zip(*arr), key=sum, reverse=True)))) Commented Sep 7, 2018 at 6:46

6 Answers

7

You can sum along axis=0, take the argsort of the sums, reverse that order, and use it to index the columns:

a[:,np.argsort(a.sum(axis=0))[::-1]]

array([[1., 0., 1.],
       [0., 0., 4.],
       [8., 8., 0.],
       [0., 0., 0.],
       [5., 0., 0.],
       [2., 2., 2.]])

1

Using a combination of np.sum and np.argsort you can achieve this as follows:

x = np.array([[1., 1., 0.],[0., 4., 0.],[8., 0., 8.],[0., 0., 0.],[5., 0., 0.],[2., 2., 2.]])
x[:, np.argsort(-np.sum(x, 0))]
array([[ 1.,  0.,  1.],
       [ 0.,  0.,  4.],
       [ 8.,  8.,  0.],
       [ 0.,  0.,  0.],
       [ 5.,  0.,  0.],
       [ 2.,  2.,  2.]])
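
One caveat when choosing between negating the sums and reversing the argsort result: the two can order tied columns differently. A minimal sketch, assuming a hypothetical array t whose first and last columns have equal sums, and passing kind="stable" so that the tie order is deterministic:

```python
import numpy as np

# Hypothetical example: columns 0 and 2 both sum to 5, column 1 sums to 3.
t = np.array([[2., 1., 4.],
              [3., 2., 1.]])
sums = t.sum(axis=0)

# Negate, then sort ascending: tied columns keep their original order.
by_negation = np.argsort(-sums, kind="stable")

# Sort ascending, then reverse: tied columns come out in reversed order.
by_reversal = np.argsort(sums, kind="stable")[::-1]

print(by_negation.tolist())  # [0, 2, 1]
print(by_reversal.tolist())  # [2, 0, 1]
```

For the array in the question there are no ties, so both forms give the same column order.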

1 Comment

Negating with -np.sum(x, 0) is simpler than flipping the indices at the end. Nice.
1

Swapping the last two columns is done this way:

a = np.array([[1., 1., 0.],
             [0., 4., 0.],
             [8., 0., 8.],
             [0., 0., 0.],
             [5., 0., 0.],
             [2., 2., 2.]])

result = a[:, [0, 2, 1]]

So, what you need is to calculate those indexes [0, 2, 1] based on column sums.

This gets you the sums of all columns:

a.sum(axis=0)  # array([16.,  7., 10.])

and from that, you get the indices for sorting:

np.argsort(np.array([16.,  7., 10.]))   # [1, 2, 0]

You need to flip it to get the highest-to-lowest order:

np.flip([1, 2, 0])   # [0, 2, 1]

So, all together, it is:

result = a[:, np.flip(np.argsort(a.sum(axis=0)))]
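
The steps above can be wrapped in a small helper. This is only a sketch: the function name sort_columns_by_sum is made up for illustration and is not part of NumPy.

```python
import numpy as np

def sort_columns_by_sum(a):
    """Return a copy of `a` with columns ordered by descending column sum."""
    col_sums = a.sum(axis=0)               # one sum per column
    order = np.flip(np.argsort(col_sums))  # ascending indices, then reversed
    return a[:, order]                     # index arrays produce a new array

a = np.array([[1., 1., 0.],
              [0., 4., 0.],
              [8., 0., 8.],
              [0., 0., 0.],
              [5., 0., 0.],
              [2., 2., 2.]])

result = sort_columns_by_sum(a)
print(result.sum(axis=0).tolist())  # [16.0, 10.0, 7.0]
```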


0

You can do something like this:

import numpy as np

def main():
    a = np.array([[1., 1., 0.],
                  [0., 4., 0.],
                  [8., 0., 8.],
                  [0., 0., 0.],
                  [5., 0., 0.],
                  [2., 2., 2.]])
    col_sum = np.sum(a, axis=0)        # sum of each column
    sort_index = np.argsort(-col_sum)  # column indices in descending order of sum
    out_matrix = a[:, sort_index]      # reordered copy of a
    print(out_matrix)

main()

I think a new array (out_matrix) is necessary: indexing with an array of indices returns a copy, so the columns are not reordered in place.
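
If keeping the same array object matters, one option is to assign the reordered columns back through a full slice. This is only a sketch: the right-hand side still creates a temporary copy, so it saves no memory, and the name alias is purely illustrative.

```python
import numpy as np

a = np.array([[1., 1., 0.],
              [0., 4., 0.],
              [8., 0., 8.],
              [0., 0., 0.],
              [5., 0., 0.],
              [2., 2., 2.]])
alias = a  # second reference to the same array object

order = np.argsort(-a.sum(axis=0))
# a[:, order] is evaluated into a temporary array first,
# then its contents are copied back into a's existing buffer.
a[:] = a[:, order]

print(alias.sum(axis=0).tolist())  # [16.0, 10.0, 7.0]
```

Because alias and a refer to the same object, the reordering is visible through both names.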


0
arr = np.array([[1., 1., 0.],
                [0., 4., 0.],
                [8., 0., 8.],
                [0., 0., 0.],
                [5., 0., 0.],
                [2., 2., 2.]])

perm = np.flip(np.argsort(np.sum(arr, axis=0)))
result = arr[:, perm]

Get the sums; then get the permutation (array of indices) which sorts the sums. argsort sorts in ascending order, so reverse the permutation so we get indices from highest sum to lowest. Finally, reorder the original array by the same permutation.

1 Comment

@AkshayNevrekar: Right. Thanks.
0

Or you can use pandas:

>>> import pandas as pd, numpy as np
>>> arr=np.array([[1., 1., 0.],
       [0., 4., 0.],
       [8., 0., 8.],
       [0., 0., 0.],
       [5., 0., 0.],
       [2., 2., 2.]])
>>> df=pd.DataFrame(arr)
>>> df[df.sum().sort_values(ascending=False).index].values
array([[ 1.,  0.,  1.],
       [ 0.,  0.,  4.],
       [ 8.,  8.,  0.],
       [ 0.,  0.,  0.],
       [ 5.,  0.,  0.],
       [ 2.,  2.,  2.]])
>>> 

2 Comments

Here is a loop which is executed in python, rather than in numpy: sorted(df.columns.tolist(),key=lambda x: df[x].sum(),reverse=True). That is not efficient. The whole point of using numpy is not to make loops in python.
@zvone Edited mine, much cleaner
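
To illustrate the point about Python-level loops, here is a small sketch that puts a sorted-based approach next to the argsort one and checks that they agree on the question's array. The variable names looped and vectorised are only illustrative, and no timing claim is made here.

```python
import numpy as np

arr = np.array([[1., 1., 0.],
                [0., 4., 0.],
                [8., 0., 8.],
                [0., 0., 0.],
                [5., 0., 0.],
                [2., 2., 2.]])

# Python-level approach: sorted() calls sum() once per column tuple.
looped = np.array(list(zip(*sorted(zip(*arr), key=sum, reverse=True))))

# Vectorised approach: sums and ordering are computed inside NumPy.
vectorised = arr[:, np.argsort(-arr.sum(axis=0), kind="stable")]

print(np.array_equal(looped, vectorised))  # True
```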
