32

I'm trying to convert all my code to Python. I want to sort an array that has two columns, so that the rows are ordered by the second column in ascending order. Then I need to sum the first-column data (from the first row to, for example, the 100th row). I used "Data.sort(axis=1)", but it doesn't work. Does anyone have an idea how to solve this problem?

1
  • You can use np.partition if all you need is the top-k elements; there is no need to sort everything. Commented Aug 5, 2024 at 11:59
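
A rough sketch of the partition idea from the comment above. Note the swap: it uses np.argpartition rather than np.partition, because the row indices are needed to pick the matching first-column values. The sample array is borrowed from the answer below, and k = 3 is an arbitrary choice for illustration.

```python
import numpy as np

arr = np.array([[-0.30565392, -0.96605562],
                [ 0.85331367, -2.62963495],
                [ 0.87839643, -0.28283675],
                [ 0.72676698,  0.93213482],
                [-0.52007354,  0.27752806],
                [-0.08701666,  0.22764316],
                [-1.78897817,  0.50737573],
                [ 0.62260038, -1.96012161],
                [-1.98231706,  0.36523876],
                [-1.07587382, -2.3022289 ]])

k = 3  # hypothetical: number of rows with the smallest second-column values

# Indices of the k rows whose second column is smallest.
# The order within these k indices is not guaranteed.
idx = np.argpartition(arr[:, 1], k - 1)[:k]

# Sum of the first column over those k rows; the order does not matter for a sum.
partial_sum = arr[idx, 0].sum()
```

This only helps when the order of the selected rows is irrelevant, as it is for a sum. If the rows themselves must come out sorted, a full argsort is still needed.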

2 Answers

67

Use .argsort(). It returns a numpy.array of indices that would sort the given numpy.array. You can call it as a function or as a method on your array. For example, suppose you have:

import numpy as np

arr = np.array([[-0.30565392, -0.96605562],
                [ 0.85331367, -2.62963495],
                [ 0.87839643, -0.28283675],
                [ 0.72676698,  0.93213482],
                [-0.52007354,  0.27752806],
                [-0.08701666,  0.22764316],
                [-1.78897817,  0.50737573],
                [ 0.62260038, -1.96012161],
                [-1.98231706,  0.36523876],
                [-1.07587382, -2.3022289 ]])

You can now call .argsort() on the column you want to sort by. It gives you an array of row indices that sort that particular column, which you can pass as an index to your original array:

>>> arr[arr[:, 1].argsort()]
array([[ 0.85331367, -2.62963495],
       [-1.07587382, -2.3022289 ],
       [ 0.62260038, -1.96012161],
       [-0.30565392, -0.96605562],
       [ 0.87839643, -0.28283675],
       [-0.08701666,  0.22764316],
       [-0.52007354,  0.27752806],
       [-1.98231706,  0.36523876],
       [-1.78897817,  0.50737573],
       [ 0.72676698,  0.93213482]])

You can equivalently use numpy.argsort():

>>> arr[np.argsort(arr[:, 1])]
array([[ 0.85331367, -2.62963495],
       [-1.07587382, -2.3022289 ],
       [ 0.62260038, -1.96012161],
       [-0.30565392, -0.96605562],
       [ 0.87839643, -0.28283675],
       [-0.08701666,  0.22764316],
       [-0.52007354,  0.27752806],
       [-1.98231706,  0.36523876],
       [-1.78897817,  0.50737573],
       [ 0.72676698,  0.93213482]])
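
To cover the second half of the question (summing the first column over the first rows of the sorted array), something along these lines should work. It is a minimal sketch: n = 3 stands in for the 100 rows mentioned in the question, and the names sorted_arr and first_col_sum are just illustrative.

```python
import numpy as np

arr = np.array([[-0.30565392, -0.96605562],
                [ 0.85331367, -2.62963495],
                [ 0.87839643, -0.28283675],
                [ 0.72676698,  0.93213482],
                [-0.52007354,  0.27752806],
                [-0.08701666,  0.22764316],
                [-1.78897817,  0.50737573],
                [ 0.62260038, -1.96012161],
                [-1.98231706,  0.36523876],
                [-1.07587382, -2.3022289 ]])

# Reorder whole rows by ascending second column.
sorted_arr = arr[arr[:, 1].argsort()]

n = 3  # hypothetical: use 100 for the case described in the question

# Sum the first column over the first n rows of the sorted array.
first_col_sum = sorted_arr[:n, 0].sum()
```

Slicing with [:n] is safe even if the array has fewer than n rows; it simply sums whatever rows exist.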

5

sorted(Data, key=lambda row: row[1]) should do it.
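
For illustration, here is how that one-liner might look on a small two-column array. The sample rows are taken from the other answer, and converting back with np.array is an extra step not mentioned above; sorted() itself returns a plain list of rows.

```python
import numpy as np

Data = np.array([[-0.30565392, -0.96605562],
                 [ 0.85331367, -2.62963495],
                 [ 0.87839643, -0.28283675],
                 [ 0.72676698,  0.93213482]])

# sorted() iterates over the rows and orders them by their second element.
rows = sorted(Data, key=lambda row: row[1])

# The result is a list of 1-D arrays; rebuild a 2-D array if one is needed.
sorted_data = np.array(rows)
```

This also works when Data is a list of lists. For large arrays, the argsort approach is likely faster because it stays inside NumPy.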

2 Comments

Using this command, I have the same problem as before, which is duplication in the sorting. If the input data is: Data=[1.0 0.70 0.0 0.69 3.0 0.57 0.0 0.68 1.0 0.56 2.0 0.51] the sorting results are: [['0.0', '0.68'], ['0.0', '0.69'], ['0.70', '1.0'], ['0.56', '1.0'], ['0.51', '2.0'], ['0.57', '3.0']]. Do you have another idea?
I'm afraid I don't quite understand the problem. What's duplicated? If your input Data is a flat list, why would sorting it result in a list of lists?
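
One possible reading of the output in the first comment, and this is only a guess: the rows look as if Data.sort(axis=1) from the question had already been applied in place before sorted() was called. That call sorts the values inside each row, which breaks the pairing between the two columns. The sketch below assumes the flat values form two-column rows and uses floats rather than the strings shown in the comment.

```python
import numpy as np

# Values from the comment, reshaped into two columns (assumed layout).
flat = [1.0, 0.70, 0.0, 0.69, 3.0, 0.57, 0.0, 0.68, 1.0, 0.56, 2.0, 0.51]
data = np.array(flat, dtype=float).reshape(-1, 2)

# ndarray.sort(axis=1) sorts each row independently, in place,
# so the smaller value of every pair ends up in column 0.
scrambled = data.copy()
scrambled.sort(axis=1)

# Sorting the scrambled rows by their second element gives rows like
# those listed in the comment.
after_row_sort = np.array(sorted(scrambled, key=lambda row: row[1]))

# Sorting the untouched data by its second column keeps each row intact.
by_second = data[data[:, 1].argsort(kind="stable")]
```

If this is what happened, reloading the data and skipping the Data.sort(axis=1) call should avoid the problem. The strings in the comment also suggest the values were never converted to numbers, in which case they would be compared as text.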
