import numpy as np

a = np.random.randint(0, 200, 100)  # random integer array
b1 = np.random.randint(0, 100, 50)
b2 = b1**3
c = []

I have a problem that I think should be easy, but I can't find a solution. I want to find the matching values in two arrays, then use the indices of those matches in one of them to look up values in a third array.

for i in range(len(a)):
    for j in range(len(b1)):
        if b1[j] == a[i]:
            c.append(b2[j])

c = np.asarray(c)

The method above clearly works, but it's very slow. This is just an example; in my actual work, a, b1 and b2 each have over 10,000 elements.

Any faster solutions?

  • "Any faster solutions?" Code Review is better suited for this. Commented Apr 14, 2015 at 15:33
  • @BhargavRao I think Stack Overflow can also be used for short code ;) Commented Apr 14, 2015 at 15:40
  • I agree @Kasra, but Code Review is the better site. Commented Apr 14, 2015 at 15:40
  • @BhargavRao Yeah, sure! Commented Apr 14, 2015 at 15:43

2 Answers


np.in1d(b1, a) returns a boolean array indicating whether each element of b1 is found in a.

If you wanted to get the values in b2 which corresponded to the indices of common values in a and b1, you could use the boolean array to index b2:

b2[np.in1d(b1, a)]

Using this function should be a lot faster as the for loops are pushed down to the level of NumPy's internal routines.
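As a rough check of the approach above, here is a minimal sketch that compares the boolean-mask result with the nested loop from the question. It assumes NumPy 1.17 or later (for `np.random.default_rng`) and uses `np.isin`, which newer NumPy releases recommend over `np.in1d`; the seed and the names `c_loop`, `c_mask` and `c_exact` are made up for illustration. Note that the mask returns values in the order of b1, once per element of b1, while the loop follows the order of a and repeats values when a contains duplicates. The broadcasting variant at the end reproduces the loop exactly, at the cost of a temporary len(a) × len(b1) boolean array.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
a = rng.integers(0, 200, 100)
b1 = rng.integers(0, 100, 50)
b2 = b1 ** 3

# Reference: the nested loop from the question, written as a comprehension.
c_loop = np.asarray([b2[j]
                     for i in range(len(a))
                     for j in range(len(b1))
                     if b1[j] == a[i]])

# Boolean mask: True where an element of b1 also occurs somewhere in a.
mask = np.isin(b1, a)  # on older NumPy: np.in1d(b1, a)
c_mask = b2[mask]      # values in b1 order, one per matching element of b1

# Same set of values as the loop; order and repetition may differ.
assert set(c_mask.tolist()) == set(c_loop.tolist())

# Exact reproduction of the loop: compare every a[i] with every b1[j].
# np.nonzero walks the 2-D comparison row by row, i.e. i first, then j.
i_idx, j_idx = np.nonzero(a[:, None] == b1[None, :])
c_exact = b2[j_idx]
assert np.array_equal(c_exact, c_loop)
```

For arrays of 10,000 elements each, the pairwise comparison needs on the order of 100 MB of temporary memory, so the mask version is usually the better fit when order and repetition do not matter.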


You can use numpy.intersect1d to get the intersection of two 1-D arrays. Note that if you only need the common values themselves, you don't need the indices at all:

>>> a=np.random.randint(0,200,100)
>>> b1=np.random.randint(0,100,50)
>>> 
>>> np.intersect1d(b1,a)
array([ 3,  9, 17, 19, 22, 23, 37, 53, 55, 58, 67, 85, 93, 94])

Using the intersection is also more efficient here: with a[np.in1d(a, b1)], Python has to perform an extra indexing step on top of the in1d call. The following benchmark illustrates the difference:

from timeit import timeit

# Each statement is timed as a whole, including the import and array creation.
s1 = """
import numpy as np
a=np.random.randint(0,200,100)
b1=np.random.randint(0,100,50)
np.intersect1d(b1,a)
"""
s2 = """
import numpy as np
a=np.random.randint(0,200,100)
b1=np.random.randint(0,100,50)
a[np.in1d(a, b1)]
"""

print(' first: ', timeit(stmt=s1, number=100000))
print('second : ', timeit(stmt=s2, number=100000))

Result:

 first:  3.69082999229
second :  7.77609300613
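The timings above include the import and the random array creation in every run. A hedged variant, sketched below, moves that work into timeit's `setup` argument so that only the two expressions are measured. The seed, the repetition count and the names `t_intersect` and `t_mask` are assumptions for illustration, and `np.isin` stands in for `np.in1d`. No timings are shown because they depend on the machine and the NumPy version.

```python
from timeit import timeit

# Setup runs once per timeit call and is not included in the measurement.
setup = """
import numpy as np
rng = np.random.default_rng(0)   # arbitrary seed so both runs see the same data
a = rng.integers(0, 200, 100)
b1 = rng.integers(0, 100, 50)
"""

# Time only the expression of interest in each case.
t_intersect = timeit("np.intersect1d(b1, a)", setup=setup, number=1000)
t_mask = timeit("a[np.isin(a, b1)]", setup=setup, number=1000)

print("intersect1d:", t_intersect)
print("isin + indexing:", t_mask)
```

Keep in mind that the two expressions do not return the same thing: intersect1d gives the sorted unique common values, while the masked indexing keeps every matching element of a in its original order.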

9 Comments

This only gives the intersection not the indices.
I still think you are missing the second part of the question: "I want to find the matching values in two arrays, then use the indices of one of these to find values in another array"
@Kasra: but the OP doesn't just want the intersecting values, he specifically mentions getting the indices of the intersecting values.
I can think of a few. I'm guessing there is a typo in the original question. c.append(b2[j]) should be the last line in the for loop. This would pretty much indicate the OP is trying to look up a value in B2 that is mapped by index to B1.
Yes it's b2 values I'm looking for
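Following up on the comments above: np.intersect1d can also return indices through its `return_indices` argument (available since NumPy 1.15), which can then be used to look up values in b2. The sketch below uses small made-up arrays. One caveat is that only the first occurrence of each common value is reported, so if b1 contains duplicates the result will differ from the boolean-mask approach.

```python
import numpy as np

# Small hand-made arrays (illustrative values, not from the question).
b1 = np.array([5, 3, 9, 7])
a = np.array([9, 1, 5, 5, 8])
b2 = b1 ** 3

# common: sorted unique values present in both arrays
# b1_idx / a_idx: index of the first occurrence of each common value
common, b1_idx, a_idx = np.intersect1d(b1, a, return_indices=True)

# Use the indices into b1 to pick the corresponding entries of b2.
c = b2[b1_idx]

print(common)  # [5 9]
print(b1_idx)  # [0 2]
print(c)       # [125 729]
```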