Sort one array by columns of another array - Python

Question

I have two arrays, a and b, as follows:

a = array([[19.        ,  0.84722222],
           [49.        ,  0.86111111],
           [54.        ,  0.86666667],
           [42.        ,  0.9       ],
           [ 7.        ,  0.91111111],
           [46.        ,  0.99722222]])

b = array([[46.        ,  0.46944444],
           [49.        ,  0.59722222],
           [19.        ,  0.63611111],
           [42.        ,  0.72777778],
           [54.        ,  0.74722222],
           [ 7.        ,  0.98888889]])

I would like to sort b so that its first column matches the first column of array a. My output should be

b = array([[19.        ,  0.63611111],
           [49.        ,  0.59722222],
           [54.        ,  0.74722222],
           [42.        ,  0.72777778],
           [ 7.        ,  0.98888889],
           [46.        ,  0.46944444]])

Nice question. You're allowed to change your selected answer, by the way.
– Mad Physicist, Commented Nov 23, 2019 at 18:36

Mad Physicist · Accepted Answer · 2019-11-23 18:18:47Z

6

Conceptually, you want the indices that turn column zero of b into column zero of a (this assumes both columns contain the same values). Imagine calling argsort on both: each result gives the indices that take a or b to its sorted state. If you then invert the index for a, it tells you how to get from the sorted state back to the order of a. As it happens, the argsort of a permutation is its inverse, so a second argsort does exactly that. So I present you the following:

index = np.argsort(b[:, 0])[np.argsort(np.argsort(a[:, 0]))]
b = b[index, ...]
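
As a quick check, the two lines above can be run against the arrays from the question. This is a minimal sketch: the intermediate names `rank_a`, `sort_b` and `b_sorted` are introduced here for illustration and are not part of the original answer.

```python
import numpy as np

a = np.array([[19., 0.84722222],
              [49., 0.86111111],
              [54., 0.86666667],
              [42., 0.9],
              [7., 0.91111111],
              [46., 0.99722222]])

b = np.array([[46., 0.46944444],
              [49., 0.59722222],
              [19., 0.63611111],
              [42., 0.72777778],
              [54., 0.74722222],
              [7., 0.98888889]])

# Rank of each key of a: the position a[i, 0] would take in sorted order.
rank_a = np.argsort(np.argsort(a[:, 0]))
# Indices that sort b by its first column.
sort_b = np.argsort(b[:, 0])
# For each row of a, pick the row of b that holds the key of the same rank.
index = sort_b[rank_a]
b_sorted = b[index, ...]

print(b_sorted)
```

The first column of `b_sorted` now follows the order of the first column of `a`, while each row keeps its own second-column value from `b`.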

This runs in O(n log n) time, dominated by the three sorts. The loop-based solutions here are O(n^2), since they perform a linear search for each index.
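
The inner double argsort only inverts a permutation, which can also be done without a sort. The sketch below is a variation on the idea, not the answer's method, and `inverse_permutation` is a hypothetical helper name. It saves one sort, although the overall cost remains O(n log n) because of the two remaining sorts.

```python
import numpy as np

def inverse_permutation(p):
    """Return inv such that inv[p[i]] == i, using a single O(n) scatter."""
    inv = np.empty_like(p)
    inv[p] = np.arange(p.size)
    return inv

# First column of a, taken from the question.
keys_a = np.array([19., 49., 54., 42., 7., 46.])
p = np.argsort(keys_a)

# Same result as the double argsort: the rank of each key.
ranks = inverse_permutation(p)
assert np.array_equal(ranks, np.argsort(p))
```

With this helper, the index could be written as `np.argsort(b[:, 0])[inverse_permutation(np.argsort(a[:, 0]))]`.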

edited Nov 23, 2019 at 18:18

answered Nov 23, 2019 at 18:12

Mad Physicist


3 Comments

wwii Over a year ago

... I was just going after an argsort solution.

Mad Physicist Over a year ago

@wwii. Nice. Nothing like a triple argsort to keep you interested :)

wwii Over a year ago

hmm, mine doesn't work if there are duplicate values in the first column. deleting..

Max Power · Accepted Answer · 2019-11-25 16:27:20Z

0

I think the most conceptually simple way to approach this is with a simple merge/join. Piggy-backing on your array definitions of a and b...

import pandas as pd

# convert arrays to Pandas DataFrames
df_a = pd.DataFrame(a, columns=['id', 'values_a'])
df_b = pd.DataFrame(b, columns=['id', 'values_b'])

# Merge in the values from b, into the table (and order) in a
df_merged = df_a.merge(df_b, how='left', on='id')

# Here are the two columns you want (in the desired order) as a 2D NumPy array via .values
answer = df_merged[['id', 'values_b']].values

...I find that using DataFrames for these sorts of tasks makes everything clearer and debugging much easier whenever I encounter unexpected results.
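
For reference, here is a self-contained version of the merge that can be run as-is with the arrays from the question. It is a sketch with two additions that are not in the answer above: `validate='one_to_one'`, which asks pandas to raise a `MergeError` if an id repeats on either side, and `.to_numpy()` in place of `.values`.

```python
import numpy as np
import pandas as pd

a = np.array([[19., 0.84722222],
              [49., 0.86111111],
              [54., 0.86666667],
              [42., 0.9],
              [7., 0.91111111],
              [46., 0.99722222]])

b = np.array([[46., 0.46944444],
              [49., 0.59722222],
              [19., 0.63611111],
              [42., 0.72777778],
              [54., 0.74722222],
              [7., 0.98888889]])

df_a = pd.DataFrame(a, columns=['id', 'values_a'])
df_b = pd.DataFrame(b, columns=['id', 'values_b'])

# A left merge keeps the row order of df_a; validate guards against duplicate ids.
df_merged = df_a.merge(df_b, how='left', on='id', validate='one_to_one')

answer = df_merged[['id', 'values_b']].to_numpy()
print(answer)
```

Note that the merge compares the float ids for exact equality, which is fine for whole-number ids like these but would need care if the keys were computed values.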

edited Nov 25, 2019 at 16:27

answered Nov 23, 2019 at 18:51

Max Power


BalrogOfMoria · Accepted Answer · 2019-11-23 17:46:35Z

-1

I am assuming that a and b have the same dimensions and that their first columns contain the same set of elements.

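import numpy as np

def same_order(a, b):
    # new_pos[j] will hold the row of b whose key equals a[j, 0]
    new_pos = np.full(shape=a.shape[0], fill_value=-1)
    for i in range(new_pos.shape[0]):
        # linear search for the row of a that matches the i-th key of b
        new_pos[np.where(a[:, 0] == b[i, 0])[0][0]] = i
    return b[new_pos]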
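
If a plain-Python loop is acceptable but the repeated linear search is not, a dictionary lookup keeps the same idea in roughly linear time. This is a sketch of an alternative, not the code from this answer: `same_order_dict` is a hypothetical name, and it assumes the keys in the first column are unique and compare exactly equal between a and b.

```python
import numpy as np

def same_order_dict(a, b):
    # Map each key in the first column of b to its row number.
    row_of_key = {key: i for i, key in enumerate(b[:, 0].tolist())}
    # Look up, for every key of a in order, the matching row of b.
    order = [row_of_key[key] for key in a[:, 0].tolist()]
    return b[order]

a = np.array([[19., 0.84722222],
              [49., 0.86111111],
              [54., 0.86666667],
              [42., 0.9],
              [7., 0.91111111],
              [46., 0.99722222]])

b = np.array([[46., 0.46944444],
              [49., 0.59722222],
              [19., 0.63611111],
              [42., 0.72777778],
              [54., 0.74722222],
              [7., 0.98888889]])

result = same_order_dict(a, b)
print(result)
```

A key of a that is missing from b raises a `KeyError` here, which makes a violated assumption visible instead of silently returning a wrong row.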

answered Nov 23, 2019 at 17:46

BalrogOfMoria


2 Comments

Mad Physicist Over a year ago

This is not a very good way to do it because it discards the benefits of numpy.

BalrogOfMoria Over a year ago

Indeed. Nevertheless, if performance is not a concern, it is good enough.
