How to merge two NumPy ndarrays into one ndarray using the values of a column?

Question

I have two ndarrays:

a = np.array([[1,2], [5,0], [6,4]])
b = np.array([[1,10],[6,30], [5,20]])

I want to merge them into a single array like this:

[[ 1  2 10]
 [ 5  0 20]
 [ 6  4 30]]

Does anyone know a non-iterative way to merge two arrays by the values in column 0?

So far I have only found this way:

import numpy as np

a = np.array([[1,2], [5,0], [6,4]])
b = np.array([[1,10],[6,30], [5,20]])
new0col = np.zeros((a.shape[0],1), dtype=int)
a = np.append(a, new0col, axis=1)
l1 = a[:,0].tolist()
l2 = b[:,0].tolist()
for i in l2:
    a[l1.index(i),2] = b[l2.index(i),1]
print(a)

Possible duplicate of SQL join or R's merge() function in NumPy?
– Georgy, Commented Apr 22, 2019 at 10:28

jpp · Accepted Answer · 2018-07-16 23:41:18Z

You can use numpy.searchsorted:

c = np.c_[a, b[np.searchsorted(a[:, 0], b[:, 0]), 1]]

print(c)

[[ 1  2 10]
 [ 5  0 20]
 [ 6  4 30]]

Breaking this down: np.searchsorted returns, for each value in b[:, 0], the index at which that value sits in a[:, 0] (which must be sorted in ascending order). Those indices are then used to select rows of b:

print(np.searchsorted(a[:, 0], b[:, 0]))

[0 2 1]
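
Note that the one-liner relies on two properties of the sample data: a[:, 0] is already sorted, and the index array [0 2 1] happens to be its own inverse permutation, so indexing b with it lines the rows up. A minimal sketch for keys in arbitrary order, assuming every key in a occurs exactly once in b, is to sort b by key first and then look up each key of a. The helper name merge_on_first_column is hypothetical, not a NumPy function:

```python
import numpy as np

def merge_on_first_column(a, b):
    """Append b's value columns to a, matching rows on column 0.

    Assumes every key in a[:, 0] occurs exactly once in b[:, 0].
    """
    order = np.argsort(b[:, 0])      # row order that sorts b by key
    b_sorted = b[order]
    # Position of each key of a within the sorted keys of b.
    pos = np.searchsorted(b_sorted[:, 0], a[:, 0])
    return np.column_stack((a, b_sorted[pos, 1:]))

a = np.array([[1, 2], [5, 0], [6, 4]])
b = np.array([[5, 20], [6, 30], [1, 10]])   # same keys, different row order
print(merge_on_first_column(a, b))
```

With this ordering of b, searchsorted(a[:, 0], b[:, 0]) gives [1 2 0], which is not its own inverse, so the one-liner would pair key 1 with 30. The sketch above pairs each key with its own value regardless of how either array is ordered.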

Stefano G. · Answer · 2018-07-17 14:47:07Z

I've found an alternative solution with pandas. It is less efficient than NumPy, but I want to post it too because I think it is instructive. The good solution that jpp gave me (I did not know that method) has one limitation: a and b must have the same keys.

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import pandas as pd
import numpy as np

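def merge_w_np(a, b):
    # Append one zero-filled column to a for each value column of b.
    n_cols = a.shape[1]
    zeros = np.zeros((a.shape[0], b.shape[1] - 1), dtype=int)
    a = np.append(a, zeros, axis=1)
    l1 = a[:, 0].tolist()
    # Copy b's values into the row of a with the same key (column 0);
    # rows of a whose key is missing from b keep the zero placeholder.
    for j, i in enumerate(b[:, 0].tolist()):
        a[l1.index(i), n_cols:] = b[j, 1:]
    print(a)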

def merge_w_pd(a, b):
    dfa = pd.DataFrame(data=a,                      # values
                       index=a[:,0])                # 1st column as index
    dfb = pd.DataFrame(data=b,                      # values
                       index=b[:,0])                # 1st column as index
    dfa.columns = ['id', 'value']
    dfb.columns = ['id', 'value']
    # print('a',dfa)
    # print('b',dfb)
    dfc = dfa.merge(dfb, left_on='id', right_on='id', how='outer')
    print(dfc)

a = np.array([[1,2], [2,8], [5,0], [6,4], [7,9]])
b = np.array([[1,10],[6,30], [5,20]])
merge_w_np(a, b)
merge_w_pd(a, b)
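
The explicit loop in merge_w_np can also be vectorized while still allowing keys of a that are missing from b. The following is only a sketch and not part of the answers above: left_merge and its fill parameter are hypothetical names, and it assumes b is non-empty, has unique keys, and holds integers like a.

```python
import numpy as np

def left_merge(a, b, fill=0):
    """Left join on column 0: keep every row of a, use fill where b has no key.

    Assumes b is non-empty and the keys in b[:, 0] are unique.
    """
    order = np.argsort(b[:, 0])            # row order that sorts b by key
    b_sorted = b[order]
    keys = b_sorted[:, 0]
    pos = np.searchsorted(keys, a[:, 0])
    pos = np.clip(pos, 0, len(keys) - 1)   # keep indices in range
    found = keys[pos] == a[:, 0]           # True where the key exists in b
    extra = np.full((a.shape[0], b.shape[1] - 1), fill, dtype=a.dtype)
    extra[found] = b_sorted[pos[found], 1:]
    return np.column_stack((a, extra))

a = np.array([[1, 2], [2, 8], [5, 0], [6, 4], [7, 9]])
b = np.array([[1, 10], [6, 30], [5, 20]])
print(left_merge(a, b))
```

For these inputs the rows with keys 2 and 7 keep the fill value 0, which matches what merge_w_np prints. Unlike the pandas call with how='outer', keys that appear only in b are dropped here.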
