3

I have two arrays, a and b, as follows:

a = array([[19.        ,  0.84722222],
           [49.        ,  0.86111111],
           [54.        ,  0.86666667],
           [42.        ,  0.9       ],
           [ 7.        ,  0.91111111],
           [46.        ,  0.99722222]])

b = array([[46.        ,  0.46944444],
           [49.        ,  0.59722222],
           [19.        ,  0.63611111],
           [42.        ,  0.72777778],
           [54.        ,  0.74722222],
           [ 7.        ,  0.98888889]])

I would like to sort b so that its first column matches the first column of array a. My output should be

b = array([[19.        ,  0.63611111],
           [49.        ,  0.59722222],
           [54.        ,  0.74722222],
           [42.        ,  0.72777778],
           [ 7.        ,  0.98888889],
           [46.        ,  0.46944444]])
2
  • b[np.where(a[:,None] == b[None, :])[1]] Commented Nov 23, 2019 at 17:57
  • Nice question. You're allowed to change your selected answer by the way. Commented Nov 23, 2019 at 18:36

3 Answers 3

6

Conceptually, you want the indices that turn column zero of b into column zero of a. Imagine doing argsort on both: this gives the indices that take a or b to a sorted state. If you then invert the index for a, it tells you how to get from the sorted state back to the order of a. As it happens, the argsort of a permutation is its inverse, so argsort(argsort(a[:, 0])) does exactly that. So I present you the following:

index = np.argsort(b[:, 0])[np.argsort(np.argsort(a[:, 0]))]
b = b[index, ...]

This is O(n log n) time complexity because of the three sorts. Solutions that perform a linear search for each index, such as a per-row np.where lookup, are O(n^2).
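To make the one-liner easier to follow, here is a minimal sketch that splits it into named steps, using the arrays from the question. The intermediate names (sort_a, sort_b, rank_a, b_reordered) are only illustrative, and the sketch assumes both first columns hold the same set of unique keys.

```python
import numpy as np

a = np.array([[19.0, 0.84722222],
              [49.0, 0.86111111],
              [54.0, 0.86666667],
              [42.0, 0.9],
              [7.0, 0.91111111],
              [46.0, 0.99722222]])

b = np.array([[46.0, 0.46944444],
              [49.0, 0.59722222],
              [19.0, 0.63611111],
              [42.0, 0.72777778],
              [54.0, 0.74722222],
              [7.0, 0.98888889]])

# Step 1: indices that would sort each key column.
sort_a = np.argsort(a[:, 0])
sort_b = np.argsort(b[:, 0])

# Step 2: the argsort of a permutation is its inverse, so rank_a[i]
# is the position of a[i, 0] within the sorted keys.
rank_a = np.argsort(sort_a)

# Step 3: for each row of a, pick the row of b holding the same key.
index = sort_b[rank_a]
b_reordered = b[index]

print(index)        # [2 1 4 3 5 0]
print(b_reordered)
```

With duplicate keys the key columns would still line up, but which of the tied rows of b lands where is not well defined, so it is worth checking for duplicates first.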


3 Comments

... I was just going after an argsort solution.
@wwii. Nice. Nothing like a triple argsort to keep you interested :)
hmm, mine doesn't work if there are duplicate values in the first column. deleting..
0

I think the most conceptually simple way to approach this is with a simple merge/join. Piggy-backing on your array definitions of a and b...

import pandas as pd

# convert arrays to Pandas DataFrames
df_a = pd.DataFrame(a, columns=['id', 'values_a'])
df_b = pd.DataFrame(b, columns=['id', 'values_b'])

# Merge in the values from b, into the table (and order) in a
df_merged = df_a.merge(df_b, how='left', on='id')

# Here's the two columns you want (in desired order) as a 2d numpy array via .values
answer = df_merged[['id', 'values_b']].values

...I find using DataFrames for these sorts of tasks makes everything clearer and debugging much easier whenever I encounter unexpected results.
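For reference, a self-contained version of the merge above might look like the sketch below. It adds two things that are not in the answer as written and are only suggestions: validate="one_to_one", which asks pandas to raise an error if either id column contains duplicates, and .to_numpy() in place of .values.

```python
import numpy as np
import pandas as pd

a = np.array([[19.0, 0.84722222],
              [49.0, 0.86111111],
              [54.0, 0.86666667],
              [42.0, 0.9],
              [7.0, 0.91111111],
              [46.0, 0.99722222]])

b = np.array([[46.0, 0.46944444],
              [49.0, 0.59722222],
              [19.0, 0.63611111],
              [42.0, 0.72777778],
              [54.0, 0.74722222],
              [7.0, 0.98888889]])

df_a = pd.DataFrame(a, columns=["id", "values_a"])
df_b = pd.DataFrame(b, columns=["id", "values_b"])

# A left merge keeps the row order of df_a; validate="one_to_one"
# (an addition for illustration) rejects duplicated ids on either side.
df_merged = df_a.merge(df_b, how="left", on="id", validate="one_to_one")

# Keep only the id and the values from b, as a 2D NumPy array.
answer = df_merged[["id", "values_b"]].to_numpy()
print(answer)
```

Note that the ids here are floats, so the merge relies on exact equality of the key values.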


-1

I am supposing a and b have the same dimensions and that the first column of each contains the same set of elements.

import numpy as np

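def same_order(a, b):
    # new_pos[j] will hold the row of b whose key equals a[j, 0]
    new_pos = np.full(shape=a.shape[0], fill_value=-1)
    for i in range(new_pos.shape[0]):
        # position in a of the key found in row i of b (first match)
        new_pos[np.where(a[:, 0] == b[i, 0])[0][0]] = i
    return b[new_pos]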
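Since this relies on both first columns holding the same set of elements, a small guard can check that assumption before reordering. The helper below is hypothetical (the name have_same_keys is not from any library) and is only a sketch; it also rejects duplicate keys, because they make the row matching ambiguous.

```python
import numpy as np

def have_same_keys(a, b):
    """Hypothetical helper: True if column 0 of a and b hold the same unique keys."""
    keys_a, keys_b = a[:, 0], b[:, 0]
    if keys_a.shape != keys_b.shape:
        return False
    unique_a = np.unique(keys_a)
    # Duplicates in a make the matching ambiguous, so reject them.
    if unique_a.size != keys_a.size:
        return False
    # Same length and same unique values implies b has no duplicates either.
    return bool(np.array_equal(unique_a, np.unique(keys_b)))

x = np.array([[3.0, 0.1], [1.0, 0.2], [2.0, 0.3]])
y = np.array([[1.0, 0.5], [2.0, 0.6], [3.0, 0.7]])
print(have_same_keys(x, y))  # True
```

Calling it before same_order(a, b) turns a silent mismatch into an explicit failure.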

2 Comments

This is not a very good way to do it because it discards the benefits of numpy.
Indeed. Nevertheless, if performance is not a concern, it is good enough.
