Calculate euclidean distance from scratch between 3 numpy arrays

Question

I have 3 huge NumPy arrays, and I want to build a function that computes the pairwise Euclidean distance from the points of one array to the points of the second and third arrays.

For the sake of simplicity, suppose I have these 3 arrays:

a = np.array([[1.64,0.001,1.56,0.1],
              [1.656,1.21,0.32,0.0001],
              [1.0002,0.0003,1.111,0.0003],
              [0.223,0.6665,1.2221,1.659]])

b = np.array([[1.64,0.001,1.56,0.1],
              [1.656,1.21,0.32,0.0001],
              [1.0002,0.0003,1.111,0.0003],
              [0.223,0.6665,1.2221,1.659]])

c = np.array([[1.64,0.001,1.56,0.1],
              [1.656,1.21,0.32,0.0001],
              [1.0002,0.0003,1.111,0.0003],
              [0.223,0.6665,1.2221,1.659]])

I have tried this:

def correlation(x, y, t):
    from math import sqrt

    for a,b, in zip(x,y,t):
        distance = sqrt((x[a]-x[b])**2 + (y[a]-y[b])**2 + (t[a]-t[b])**2 )
    return distance

But this code throws an error: ValueError: too many values to unpack (expected 2)

How can I correctly implement this function using NumPy or base Python?

Thanks in advance

Hi Miguel, are you aware that even if the error is eliminated, your distance will always be 0, since x[a]-x[a]=0, y[b]-y[b]=0 and t[c]-t[c]=0? So I suggest rewriting the function definition.
– zabop, Commented Feb 25, 2019 at 10:22
Hi @zabop, thank you for your suggestion. I edited the function; I think it makes more sense now.
– Miguel 2488, Commented Feb 25, 2019 at 10:25
Would you clarify what you want to represent by x[a] and x[b]? The values inside the [] must be indices of the array x, but they are not in the current form of the function.
– zabop, Commented Feb 25, 2019 at 10:32
Well, it's just the application of the Euclidean distance formula. It should be sqrt((x2 - x1)**2 + (y2 - y1)**2).
– Miguel 2488, Commented Feb 25, 2019 at 11:17

Redone R · Accepted Answer · 2019-03-01 20:58:38Z

1

First, we define a function that computes the Euclidean distance between corresponding rows of two matrices. Thanks to broadcasting, it also works for a single row against every row of a matrix.

def pairwise_distance(f, s, keepdims=False):
    return np.sqrt(np.sum((f-s)**2, axis=1, keepdims=keepdims))
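To make the two calling patterns concrete, here is a minimal sketch on toy data. The `points` array is a hypothetical example chosen so the distances are easy to check by hand; it is not part of the original question.

```python
import numpy as np

def pairwise_distance(f, s, keepdims=False):
    # Sum the squared coordinate differences along each row, then take the root.
    return np.sqrt(np.sum((f - s) ** 2, axis=1, keepdims=keepdims))

# Hypothetical toy data (assumption, not from the question).
points = np.array([[0.0, 0.0],
                   [3.0, 4.0],
                   [6.0, 8.0]])

# Same shape on both sides: distance between rows at the same index.
same_shape = pairwise_distance(points, points[::-1])

# One row against the whole matrix: the row is broadcast to every row.
one_vs_all = pairwise_distance(points[0], points)

print(same_shape)   # distances between row i and row (n - 1 - i)
print(one_vs_all)   # distances from the first point to every point
```

With these inputs, `one_vs_all` holds the distances from the origin to each point, which is the pattern the next function relies on.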

Second, we define a function that calculates all possible distances between every pair of rows of the same matrix:

def all_distances(c):
    n = c.shape[0]
    # one row and one column per point, so the result is (n, n)
    res = np.empty(shape=(n, n), dtype=float)
    for row in range(n):
        res[row, :] = pairwise_distance(c[row], c)  # using numpy broadcasting
    return res

Now we are done:

row_distances = all_distances(a)       # row-wise distances of the matrix a
column_distances = all_distances(a.T)  # column-wise distances of the same matrix
row_distances[0,2]  # distance between first and third row
row_distances[1,3]  # distance between second and fourth row
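The question asks for distances from the points of one array to the points of a second and a third array. As a rough sketch of how the same broadcasting idea could cover that case without a Python loop, the helper below builds the full matrix in one step. The name `cross_distances` is hypothetical, and `b` and `c` are simply copies of `a`, as in the question.

```python
import numpy as np

def cross_distances(p, q):
    """Hypothetical helper: entry [i, j] is the distance from p[i] to q[j]."""
    # p[:, None, :] has shape (n, 1, d) and q[None, :, :] has shape (1, m, d),
    # so the difference broadcasts to shape (n, m, d).
    diff = p[:, None, :] - q[None, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=-1))

a = np.array([[1.64, 0.001, 1.56, 0.1],
              [1.656, 1.21, 0.32, 0.0001],
              [1.0002, 0.0003, 1.111, 0.0003],
              [0.223, 0.6665, 1.2221, 1.659]])
b = a.copy()  # the question uses identical arrays
c = a.copy()

dist_ab = cross_distances(a, b)  # shape (4, 4): points of a vs points of b
dist_ac = cross_distances(a, c)  # shape (4, 4): points of a vs points of c
```

The intermediate array has shape (n, m, d), so for genuinely huge inputs this may need to be computed in chunks of rows. If SciPy is an option, `scipy.spatial.distance.cdist` computes the same kind of matrix.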

edited Mar 1, 2019 at 20:58

answered Feb 25, 2019 at 21:45

Redone R


3 Comments

Miguel 2488 Over a year ago

Hi @Redone R, thank you for your answer! What I want is to create a matrix that has all the points of the initial matrix as rows and also as columns, where each entry is the distance between the corresponding pair of points. This means I need to calculate the distances between all the points. For instance: the distance between the first point and the second, between the first and the third, the first and the fourth, then the second with the first,

Miguel 2488 Over a year ago

the second with the third, the second with the fourth, and so on until I have the distance from each point in the matrix to every other point in the matrix.

Redone R Over a year ago

Please take a look again.

zabop · Accepted Answer · 2019-02-25 10:16:58Z

0

Start with two arrays:

a = np.array([[1.64,0.001,1.56,0.1],
              [1.656,1.21,0.32,0.0001],
              [1.0002,0.0003,1.111,0.0003],
              [0.223,0.6665,1.2221,1.659]])

b = np.array([[1.64,0.001,1.56,0.1],
              [1.656,1.21,0.32,0.0001],
              [1.0002,0.0003,1.111,0.0003],
              [0.223,0.6665,1.2221,1.659]])

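The following list comprehension pairs each row of a with the row of b at the same index and computes sqrt(a**2 + b**2) element by element. Note that this is not the Euclidean distance between the rows, which would sum the squared differences (a - b)**2 across each row before taking the square root:

pairwise_dist_between_a_and_b = [(each**2 + b[index]**2)**0.5 for index, each in enumerate(a)]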

By doing so you get pairwise_dist_between_a_and_b:

[array([2.31931024e+00, 1.41421356e-03, 2.20617316e+00, 1.41421356e-01]),
 array([2.34193766e+00, 1.71119841e+00, 4.52548340e-01, 1.41421356e-04]),
 array([1.41449641e+00, 4.24264069e-04, 1.57119127e+00, 4.24264069e-04]),
 array([0.31536962, 0.94257334, 1.72831039, 2.3461803 ])]

You can use the same list comprehension for the first and third arrays.
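For comparison, a Euclidean distance between rows at the same index is built from coordinate differences. The sketch below uses `np.linalg.norm` with `axis=1` on small hypothetical arrays (not the ones from the question, whose rows are identical and would give all zeros), and repeats the call for a third array.

```python
import numpy as np

# Hypothetical toy arrays (assumption), one point per row.
a = np.array([[1.0, 2.0, 2.0],
              [0.0, 0.0, 0.0]])
b = np.array([[0.0, 0.0, 0.0],
              [3.0, 4.0, 0.0]])
c = np.array([[1.0, 2.0, 2.0],
              [0.0, 0.0, 12.0]])

# Norm of the difference along each row gives one distance per pair of rows.
row_dist_ab = np.linalg.norm(a - b, axis=1)
row_dist_ac = np.linalg.norm(a - c, axis=1)

print(row_dist_ab)  # distance from a[i] to b[i]
print(row_dist_ac)  # distance from a[i] to c[i]
```

Each result has one entry per row, so handling three arrays just means calling the same expression once per pair rather than feeding one distance result into the next.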

answered Feb 25, 2019 at 10:16

zabop


1 Comment

Miguel 2488 Over a year ago

Hi, how can I achieve this using the 3 matrices? Do I have to calculate the distance from the resulting distance matrix to the third one? Thank you.
