Python - return intersection of two arrays

Question

I have two arrays and I am trying to return a new array that equals the intersection of my original two arrays. The two original arrays should be of the same length. For example, if I have:

arr1 = np.array([(255, 255, 255), (255, 255, 255)])

arr2 = np.array([(255, 255, 255), (255, 255, 255)])

I should get:

intersectedArr = ([(255, 255, 255), (255, 255, 255)])

However, if I have:

arr1 = np.array([(100, 100, 100), (255, 255, 255)])

arr2 = np.array([(255, 255, 255), (255, 255, 255)])

I should get:

([(255, 255, 255)])

So far I've tried:

intersectedArr = np.intersect1d(arr1, arr2)

but this returns [255] instead of the expected ([(255, 255, 255)])

Can someone help? Thanks in advance!

Could you be more specific? How about ([(100,200,100),(100,100,100)]) and ([(100,200,200),(200,100,200)])?
– UnsignedByte, Commented Nov 14, 2017 at 2:02
Hi, the arrays should be of the same length. I tried intersectedArr = np.intersect1d(arr1, arr2)
– Trung Tran, Commented Nov 14, 2017 at 2:03
I mean, if you have [a,b,c] and [a,c,c], will it return [a,c] or nothing?
– UnsignedByte, Commented Nov 14, 2017 at 2:11

Matt Rundle · Accepted Answer · 2018-02-09 07:12:24Z

8

If you want to keep duplicates, like in your examples, you can use a list comprehension:

def intersection(list_a, list_b):
    # Keep every element of list_a that also occurs in list_b (duplicates preserved)
    return [e for e in list_a if e in list_b]

which produces:

in:
    [(255, 255, 255), (255, 255, 255)]
    [(255, 255, 255), (255, 255, 255)]
out:
    [(255, 255, 255), (255, 255, 255)]

in:
    [(100, 100, 100), (255, 255, 255)]
    [(255, 255, 255), (255, 255, 255)]
out:
    [(255, 255, 255)]

If you want only the unique elements common to both lists (a set intersection), though:

def intersection(a, b):
    return list(set(a).intersection(b))

which produces:

in:
    [(255, 255, 255), (255, 255, 255)]
    [(255, 255, 255), (255, 255, 255)]
out:
    [(255, 255, 255)]

in:
    [(100, 100, 100), (255, 255, 255)]
    [(255, 255, 255), (255, 255, 255)]
out:
    [(255, 255, 255)]

Cheers!
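The two functions above operate on lists of tuples. If the inputs are NumPy arrays, as in the question, one option is to convert each row to a tuple first. The following is only a sketch under that assumption (2-D integer arrays); the helper names `rows_as_tuples` and `intersection_keep_duplicates` are made up for illustration and are not part of the original answer.

```python
import numpy as np

def rows_as_tuples(arr):
    """Convert a 2-D array into a list of plain Python tuples, one per row."""
    return [tuple(row) for row in arr.tolist()]

def intersection_keep_duplicates(arr_a, arr_b):
    """Rows of arr_a that also occur in arr_b; duplicates in arr_a are kept."""
    rows_b = set(rows_as_tuples(arr_b))  # set lookup instead of a list scan
    return [row for row in rows_as_tuples(arr_a) if row in rows_b]

arr1 = np.array([(100, 100, 100), (255, 255, 255)])
arr2 = np.array([(255, 255, 255), (255, 255, 255)])
result = intersection_keep_duplicates(arr1, arr2)
print(result)  # [(255, 255, 255)]
```

Converting to tuples is needed because NumPy rows are not hashable, so they cannot be placed in a set directly.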

answered Feb 9, 2018 at 7:12

Matt Rundle


AC1009 · Accepted Answer · 2017-11-14 02:29:54Z

3

Not sure how big your arrays will get, but if they remain fairly small, this could work:

import numpy as np

arr1 = np.array([(255, 255, 255), (255, 255, 255)])
arr2 = np.array([(255, 255, 255), (255, 255, 255)])
intersectedArr = []

for a1, a2 in zip(arr1, arr2):
    if np.array_equal(a1, a2):
        intersectedArr.append(a1)
print(np.array(intersectedArr))

arr1 = np.array([(100, 100, 100), (255, 255, 255)])
arr2 = np.array([(255, 255, 255), (255, 255, 255)])
intersectedArr = []

for a1, a2 in zip(arr1, arr2):
    if np.array_equal(a1, a2):
        intersectedArr.append(a1)
print(np.array(intersectedArr))
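Since the same loop appears twice above, it could be factored into a small helper. This is a sketch only; the name `matching_rows` is hypothetical, and it assumes both inputs are 2-D arrays that are compared position by position, as in the loop above.

```python
import numpy as np

def matching_rows(arr_a, arr_b):
    """Return the rows of arr_a that equal the row at the same index in arr_b."""
    matches = [a for a, b in zip(arr_a, arr_b) if np.array_equal(a, b)]
    # Reshape so that an empty result still has the same number of columns.
    return np.array(matches).reshape(-1, arr_a.shape[1])

same = np.array([(255, 255, 255), (255, 255, 255)])
mixed = np.array([(100, 100, 100), (255, 255, 255)])

print(matching_rows(same, same).tolist())   # [[255, 255, 255], [255, 255, 255]]
print(matching_rows(mixed, same).tolist())  # [[255, 255, 255]]
```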

answered Nov 14, 2017 at 2:29

AC1009


f5r5e5d · Accepted Answer · 2017-11-14 06:37:14Z

3

How about a NumPy answer?

import numpy as np


arr1 = np.array([(255, 255, 255), (255, 255, 25)])  # changed some to 25
arr2 = np.array([(255, 25, 255), (255, 255, 25)])

arr1[np.where(arr1==arr2)]

array([255, 255, 255, 255,  25])

Second example:

arr1 = np.array([(100, 100, 100), (255, 255, 255)])
arr2 = np.array([(255, 255, 255), (255, 255, 255)])

arr1[np.where(arr1==arr2)]

array([255, 255, 255])
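Note that `arr1 == arr2` compares element by element, so the results above are flat arrays of matching values rather than whole rows. If whole rows are wanted, as in the question, one possible variation is to require every column to match with `.all(axis=1)`. A sketch, reusing the modified first example (the names `row_mask` and `matching` are introduced here for illustration):

```python
import numpy as np

arr1 = np.array([(255, 255, 255), (255, 255, 25)])
arr2 = np.array([(255, 25, 255), (255, 255, 25)])

# True only for rows where all three columns are equal at the same index.
row_mask = (arr1 == arr2).all(axis=1)
matching = arr1[row_mask]
print(matching.tolist())  # [[255, 255, 25]]
```

Like the original, this compares rows at the same position only; it does not look for a row of `arr1` anywhere else in `arr2`.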

edited Nov 14, 2017 at 6:37

answered Nov 14, 2017 at 6:23

f5r5e5d


1 Comment

Xiaojian Chen Over a year ago

This one is best for me, since you don't need to write iteration.

Ken Y-N · Accepted Answer · 2017-11-14 02:35:56Z

NOTE: This assumes [a, b, c] and [b, c, a] give [a, b, c]; that is, the order of elements is ignored.

OK, I've done a little experimenting and this might be what you are after. Given:

arr1a = np.array([(255, 255, 255), (255, 255, 255)])
arr1b = np.array([(100, 100, 100), (255, 255, 255)])
arr2 = np.array([(255, 255, 255), (255, 255, 255)])

Then we can find an intersection with:

np.array([item in arr2 for item in arr1a])

i.e., for each element in arr1a, check whether it also appears in arr2. This gives a result of:

>>> array([ True,  True], dtype=bool)

Similarly:

np.array([item in arr2 for item in arr1b])
>>> array([False,  True], dtype=bool)

Now, we can use this result to pick the common values from the original lists:

mask = np.array([item in arr2 for item in arr1a])
arr1a[mask]
>>> array([[255, 255, 255],
           [255, 255, 255]])

And:

mask = np.array([item in arr2 for item in arr1b])
arr1b[mask]
>>> array([[255, 255, 255]])
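One caveat: for NumPy arrays, `item in arr2` is evaluated roughly as `(arr2 == item).any()`, so a row can be reported as present when only some of its values line up with a row of `arr2`. A stricter row-wise check can be sketched with broadcasting. The helper name `rows_in` and the input `arr1c` are hypothetical, chosen to show the difference, and the approach builds a `len(arr_a) x len(arr_b) x columns` boolean array, so it is only reasonable for small inputs.

```python
import numpy as np

def rows_in(arr_a, arr_b):
    """Boolean mask: True where a row of arr_a equals some full row of arr_b."""
    # Broadcast to shape (len(arr_a), len(arr_b), n_columns) and compare.
    pairwise = arr_a[:, None, :] == arr_b[None, :, :]
    return pairwise.all(axis=2).any(axis=1)

arr1c = np.array([(100, 255, 100), (255, 255, 255)])  # hypothetical third input
arr2 = np.array([(255, 255, 255), (255, 255, 255)])

naive = np.array([item in arr2 for item in arr1c])
strict = rows_in(arr1c, arr2)
print(naive.tolist())          # [True, True]
print(strict.tolist())         # [False, True]
print(arr1c[strict].tolist())  # [[255, 255, 255]]
```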

Andy Hayden · Accepted Answer · 2017-11-14 06:49:33Z

For larger arrays it might help to use pandas' groupby and cumcount (assuming `import pandas as pd`, with arr1 and arr2 taken from the second example in the question):

In [11]: df1 = pd.DataFrame(arr1)

In [12]: df1["cumcount"] = df1.groupby([0, 1, 2]).cumcount()

In [13]: df1
Out[13]:
     0    1    2  cumcount
0  100  100  100         0
1  255  255  255         0

In [14]: df2 = pd.DataFrame(arr2)

In [15]: df2["cumcount"] = df2.groupby([0, 1, 2]).cumcount()

In [16]: df2
Out[16]:
     0    1    2  cumcount
0  255  255  255         0
1  255  255  255         1

Now a merge gets you the array you desire:

In [21]: df1.merge(df1).iloc[:, :3].values
Out[21]:
array([[100, 100, 100],
       [255, 255, 255]])

In [22]: df1.merge(df2).iloc[:, :3].values
Out[22]: array([[255, 255, 255]])

In [23]: df2.merge(df2).iloc[:, :3].values
Out[23]:
array([[255, 255, 255],
       [255, 255, 255]])

active_VTA · Accepted Answer · 2021-09-06 07:15:02Z

0

In your case, you want to compare whole rows instead of individual elements, so this is a 2D array problem. I would recommend a row-wise variant of intersect1d, that is, an intersection of 2D NumPy arrays. I found a good solution here: Intersection of 2D numpy ndarrays.

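import numpy as np

def multidim_intersect(arr1, arr2):
    # View each row as one structured element so intersect1d compares whole rows
    arr1_view = arr1.view([('', arr1.dtype)] * arr1.shape[1])
    arr2_view = arr2.view([('', arr2.dtype)] * arr2.shape[1])
    intersected = np.intersect1d(arr1_view, arr2_view)
    # Convert the structured result back to a plain 2-D array
    return intersected.view(arr1.dtype).reshape(-1, arr1.shape[1])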

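The code above views each row of the original arrays as a single structured element, intersects the two arrays row-wise, and then converts the result back to a 2-D array. Note that intersect1d returns sorted, unique values, so duplicate rows are not kept.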

edited Sep 6, 2021 at 7:15

answered Sep 1, 2021 at 6:55

active_VTA


2 Comments

Community Over a year ago

Please add further details to expand on your answer, such as working code or documentation citations.

NKSM Over a year ago

While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes.
