4

I have two arrays of different sizes containing 3d points. I would like to efficiently compare the two arrays and find the points that match and ultimately return a simple number of matching points.

pA=[[0,0,0],[0,1,0],[1,2,4],[10,3,4],[1,20,1],[5,3,2]]
pB=[[14,1,0],[1,2,4],[1,20,1],[15,1,0]]

#returns 2

Currently I have a sloppy loop that does the trick, but it is not very performance friendly which is a problem considering I am trying to match many pairs of arrays with a larger number of points

t= np.array([pA[x]==pB for x in range(len(pA))]).sum(2)
print np.sum(t==3)

I'm just not sure how to efficiently compare two multidimensional arrays of different sizes. And then how to do multiple iterations for a large number of pairs.

EDIT

Found a bit of a workaround which is pretty fast that combines the arrays, makes a unique version of the array, and then compares the lengths of the two arrays.

pts=np.concatenate((pA,pB),axis=0)
pts2 = np.unique(pts.view([('', pts.dtype)]*pts.shape[1]))
return len(pts)-len(pts2)
1
  • Could there be repetitions of points within each list? Commented Apr 8, 2015 at 18:00

2 Answers 2

3

No idea how this performs on your full dataset but try using Scipy's kdtree:

from scipy.spatial import cKDTree

pA=[[0,0,0],[0,1,0],[1,2,4],[10,3,4],[1,20,1],[5,3,2]]
pB=[[14,1,0],[1,2,4],[1,20,1],[15,1,0]]

kdtree = cKDTree(pA)
dists, inds = kdtree.query(pB, distance_upper_bound=1e-5)
result = (dists == 0).sum()
Sign up to request clarification or add additional context in comments.

7 Comments

Scipy definitely seems to have tools that would work well with this problem. Only issue is that this is for a script that is to run in Autodesk Maya and I would need to compile the Scipy module for this particular Maya version (something I have no experience in doing). Might not be too hard though if it is necessary.
I see. Does numpy with with Maya then?
Numpy I was able to find a compiled version for Maya. In my brief search I was unable to find a Scipy one. I may have to put more effort into getting Scipy working in Maya though since it seems to have some better functions for 3D from what little I have seen.
You might have luck with a pure-Python implementation like this: github.com/stefankoegl/kdtree . Or maybe there's another simple and efficient method. Worth mentioning your Maya constraint in the question though.
Seems like kdtree is quite useful when it comes to numpy. Good to know about it!
|
1

Here's one approach using just numpy operations. The basic idea here is that we concatenate those two lists into one numpy array. Then, we sort it row-wise to bring the matching points at consecutive rows. Next up, we do diff to get all zero rows for the matching ones, which is picked up by np.all(...==0,1). We count all those occurrences to give us the desired output of count of matching points between those two lists.

The implementation is listed below -

import numpy as np

# Inputs
pA=[[0,0,0],[0,1,0],[1,2,4],[10,3,4],[1,20,1],[5,3,2]]
pB=[[14,1,0],[1,2,4],[1,20,1],[15,1,0]]

# Form concatenate array of pA and pB
pts = np.concatenate((pA,pB),axis=0)

# Sort pts by rows
spts = pts[pts[:,1].argsort(),]

# Finally get counts by DIFFing along rows and counting all zero rows
counts = np.sum(np.diff(np.all(np.diff(spts,axis=0)==0,1)+0)==1)

Output -

In [152]: counts
Out[152]: 2

The above code works even if you have duplicate points in either of the lists. So, let's add some duplicate points in the inputs of the earlier code -

# Inputs
pA=[[0,0,0],[0,1,0],[1,2,4],[10,3,4],[1,20,1],[5,3,2],[1,2,4]]
pB=[[14,1,0],[1,2,4],[1,20,1],[15,1,0],[1,2,4]]

After code run with the modified inputs, the output still remains as 2 which is the expected output.

If you are sure that there are no duplicate entries in either of the lists, you can use a simplified version to replace the last step -

counts = np.sum(np.all(np.diff(spts,axis=0)==0,1))

3 Comments

This idea is essentially what I have in use at the moment, though I am sure yours is much more elegant. My variation concatenates the arrays and then creates a new array with all duplicate entries removed and compares the two array lengths to get the number. pts=np.concatenate((pA,pB),axis=0) pts2 = np.unique(pts.view([('', pts.dtype)]*pts.shape[1])) counts = len(pts)-len(pts2)
@AlexRideout Yes, that all + diff part is essentially unique finding in a raw version, and might be efficient as it also avoids intermediate variable creation. Added a simplified version at the end of the new edits to the solution for lists with no duplicate entries. Just curious, if you had a chance to test out these approaches for runtime performance?
Your approach definitely seems more elegant than my unique method. I haven't had the chance to do any official measurements as far as exact time goes. Mostly have only done subjective tests on the geometry in my Maya scenes. I am currently using a loop to iterate through all of the pairs of point arrays I have generated (perhaps some way to vectorize this step?) and this runs through roughly 6,000 pairs/second which is somewhat acceptable for my implementation. Though I have been dealing with upwards of 200,000 pairs, so not perfectly ideal.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.