3

I have two arrays, the first np.array is the points from, and the second np.array are all the distances I need to calculate.

Example:

 import numpy as np
 from_array = np.array([(0,1), (1,1), ..., (x,y)])
 to_array = np.array([(5,1), (3,1), ..., (x,y)])

What I need to do is take the first entry of the from_array and calculate all the distances between from_array[0] to all points in to_array, then keep the maximum distance.

So I can brute for this:

 def get_distances(from_array, to_array):
     results = []
     distances = []
     for pt in from_array:
        for to in to_array:
            results.append(calc_dist(pt, to))
        distances.append(results)
     return distances

But this is slow, I am looking for an optimized way of doing the calculation since I could have thousands of points.

The end goal is to calculate the Hausdorff Distance.

fhd = np.mean(np.min(SomeDistanceArray,axis=0))
rhd = np.mean(np.min(SomeDistanceArray,axis=1))
print (max(fhd, rhd))

I want to use numpy for this task only. My distance can either be euclidean or square euclidean distance.

So what I am looking help for is an optimized method for calculating the euclidean distance methods for two np.arrays. It should be noted that array 1 might have more rows than array 2. Means that the length of the 2D array (x,y) could be comparing 10 row to 30 rows.

3
  • Look into Scipy's cdist. Commented Dec 6, 2016 at 13:53
  • 1
    I feel like I just answered this yesterday Commented Dec 6, 2016 at 14:03
  • Scipy Version 0.19.0 (development version as of Jan2017) has an implementation of 'directed Hausdorff distance' you may want to look into. It uses an efficient algorithm (see paper) with a close to O(m) runtime (rather than O(m*n)). Commented Jan 10, 2017 at 21:00

2 Answers 2

6

Here's a NumPy based approach with np.einsum -

subs = from_array[:,None] - to_array
sq_eucliean_dist = np.einsum('ijk,ijk->ij',subs,subs)
eucliean_dist = np.sqrt(sq_eucliean_dist)

Note: If you are later computing np.mean(np.min(SomeDistanceArray,axis=0)), you can skip the computation for eucliean_dist and directly use sq_eucliean_dist as SomeDistanceArray, because computing square-root would be pretty expensive.


What does np.einsum('ijk,ijk->ij',subs,subs) do? It performs elementwise multiplication between the same array subs, i.e. essentially squaring and then does sum-reduction along the last axis, thus losing it in that reduction process.

So, why not explicitly do the squaring and summing? Well, the benefit with np.einsum is that it does both the squaring and summing in one step, giving us noticeable performance efficiency.

Thus, finally if from_array were (N x 2) array and to_array were (M x 2) array, the output from np.einsum would be the squared euclidean distances as a 2D array of shape (N x M). More info on the string notation itself would involve a longer discussion, some of which could be found in this post and the official documentation link posted earlier.

Sign up to request clarification or add additional context in comments.

2 Comments

can you explain the einsum function? I am not familiar with the notation ijk,ikj.>ij
@josh1234 Added few comments on it.
1

Numpy only. So no scipy.spatial.distance.cdist

First, don't use tuples, use 2xN and 2xM arrays. Then broadcast.

np.linalg.norm(from_array[:,:,None]-to_array[:,None,:], axis=0)

If you have an ancient version of numpy without a vecorizable linalg.norm (i.e. you're using Abaqus) do this:

np.sum((from_array[:,:,None]-to_array[:,None,:])**2, axis=0).__pow__(0.5)

8 Comments

yes no cdist :(. Does this work for uneven arrays? Say a 30x2 and a 50x2 array?
Yes, just make sure the common dimension is first, so 2x30 and 2x50, so you can broadcast as above
So I need to rotate my array from being column X,Y to X,Y as rows:
sorry, I mean, I need to rotate my array?
Transpose can be done by just taking the T attribute, e.g. x.T is the transpose of x.
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.