
I can't find an efficient way to reproduce MATLAB's "ismember(a,b,'rows')" in Python, where a and b are arrays of shape (ma,2) and (mb,2) respectively, and ma and mb are the numbers of couples.

The ismember module (https://pypi.org/project/ismember/) crashes because at some point, namely when computing np.all(a[:, None] == b, axis=2).any(axis=1), it needs to create an array of shape (ma,mb,2), which is too big to fit in memory. Moreover, even when the function works (because the arrays are small enough), it is about 100 times slower than in MATLAB. I guess that is because MATLAB uses a built-in MEX function. Why doesn't Python have what I would consider such an important function? I use it countless times in my calculations...
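To give a sense of scale, here is a rough estimate of the memory that the broadcast comparison needs. The helper name broadcast_bytes and the array sizes are assumptions chosen for illustration; they are not taken from the ismember package.

```python
import numpy as np

def broadcast_bytes(ma, mb, ncols=2):
    """Bytes needed for the boolean array produced by a[:, None] == b."""
    # The comparison broadcasts to shape (ma, mb, ncols); a bool takes 1 byte.
    return ma * mb * ncols * np.dtype(bool).itemsize

# Sanity check on small arrays: the estimate matches the real intermediate.
a_small = np.zeros((3, 2))
b_small = np.zeros((4, 2))
assert (a_small[:, None] == b_small).nbytes == broadcast_bytes(3, 4)

# Hypothetical sizes, for illustration only.
print(broadcast_bytes(100_000, 100_000) / 1e9, "GB")  # 20.0 GB
```

The cost grows with the product ma*mb, so even moderately large inputs exhaust memory.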

PS: the solution proposed here, "Python version of ismember with 'rows' and index", does not correspond to MATLAB's true ismember function, since it does not work couple by couple: it does not verify that a couple of values of 'a' exists in 'b', only that the values of each column of 'a' exist in the corresponding column of 'b'.
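A small example of the difference. The column-wise expression below is only my assumption of what such a check looks like, not the exact code of that other answer.

```python
import numpy as np

a = np.array([[1, 4]])
b = np.array([[1, 2], [3, 4]])

# Column-wise check (assumed form): each column is tested independently.
colwise = np.isin(a[:, 0], b[:, 0]) & np.isin(a[:, 1], b[:, 1])

# Row-wise check: the couple (1, 4) must appear as a whole row of b.
rowwise = (a[:, None] == b).all(axis=2).any(axis=1)

print(colwise)  # True for the only row: 1 is in column 0, 4 is in column 1
print(rowwise)  # False for the only row: (1, 4) is not a row of b
```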


1 Answer


You can use np.unique(array, axis=0) to find the identical rows of an array. With return_inverse=True it also assigns an integer label to every row, so you can reduce your 2D problem to a 1D problem that is easily solved with np.isin():

import numpy as np

# Dummy example array:
a = np.array([[1,2],[3,4]])
b = np.array([[3,5],[2,3],[3,4]])

# ismember_row function, which rows of a are in b:
def ismember_row(a,b):
    # Get the unique row index
    _, rev = np.unique(np.concatenate((b,a)),axis=0,return_inverse=True)
    # Split the index
    a_rev = rev[len(b):]
    b_rev = rev[:len(b)]
    # Return the result:
    return np.isin(a_rev,b_rev)

res = ismember_row(a,b)
# res = array([False,  True])
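
If you also need the positions, as with MATLAB's [tf, loc] = ismember(a, b, 'rows'), the same labelling trick can be extended. This is only a sketch: the function name ismember_rows_with_index is hypothetical, indices are 0-based, and -1 (instead of MATLAB's 0) marks rows of a that are not found in b.

```python
import numpy as np

def ismember_rows_with_index(a, b):
    """Return (tf, loc): tf[i] says whether a[i] is a row of b,
    loc[i] is the lowest index of that row in b, or -1 if absent."""
    nb = len(b)
    # Label every row of the stacked array with the id of its unique row.
    _, inv = np.unique(np.concatenate((b, a)), axis=0, return_inverse=True)
    inv = inv.ravel()  # flatten defensively so the slices below are 1-D
    b_ids, a_ids = inv[:nb], inv[nb:]
    # For each unique row id, keep the smallest index at which it occurs in b.
    # nb is used as a sentinel meaning "this row never occurs in b".
    first_pos = np.full(inv.max() + 1, nb)
    np.minimum.at(first_pos, b_ids, np.arange(nb))
    loc = first_pos[a_ids]
    tf = loc < nb
    return tf, np.where(tf, loc, -1)

a = np.array([[1, 2], [3, 4]])
b = np.array([[3, 5], [2, 3], [3, 4]])
tf, loc = ismember_rows_with_index(a, b)
# tf  = array([False,  True])
# loc = array([-1,  2])
```

np.minimum.at is used rather than plain fancy-index assignment so that duplicated rows in b reliably report their lowest index.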

3 Comments

Thank you! This is 2160 times faster than the solution proposed on PyPI... you saved my day. For users interested in having the complete ismember function that also returns the reversed indices, you just need to use the code of ismember_row here in place of that of the _is_in_row function of the pypi/ismember module.
Since you were the one who came up with the solution, would you mind adding it to the GitHub repository of pypi/ismember, here: github.com/erdogant/ismember/blob/master/ismember/ismember.py ? All you need to do is replace the code of the function is_row_in(a, b) with the code of your function ismember_row. This would be of great use, as your elegant solution can process very large arrays in a very short time.
It seems that you have already sent a pull request and that it has been accepted. Perfect.
