
I have 3 huge numpy arrays, and I want to build a function that computes the pairwise Euclidean distance from the points of one array to the points of the second and third arrays.

For the sake of simplicity, suppose I have these 3 arrays:

a = np.array([[1.64,0.001,1.56,0.1],
              [1.656,1.21,0.32,0.0001],
              [1.0002,0.0003,1.111,0.0003],
              [0.223,0.6665,1.2221,1.659]])

b = np.array([[1.64,0.001,1.56,0.1],
              [1.656,1.21,0.32,0.0001],
              [1.0002,0.0003,1.111,0.0003],
              [0.223,0.6665,1.2221,1.659]])

c = np.array([[1.64,0.001,1.56,0.1],
              [1.656,1.21,0.32,0.0001],
              [1.0002,0.0003,1.111,0.0003],
              [0.223,0.6665,1.2221,1.659]])

I have tried this:

def correlation(x, y, t):
    from math import sqrt

    for a,b, in zip(x,y,t):
        distance = sqrt((x[a]-x[b])**2 + (y[a]-y[b])**2 + (t[a]-t[b])**2 )
    return distance

But this code throws an error: ValueError: too many values to unpack (expected 2)

How can I correctly implement this function using numpy or base Python?

Thanks in advance

  • Hi Miguel, are you aware that even if the syntax error is eliminated, your distance will always be 0, since x[a]-x[a]=0, y[b]-y[b]=0 and t[c]-t[c]=0? So I suggest rewriting the function definition. Commented Feb 25, 2019 at 10:22
  • Hi @zabop, thank you for your suggestion. I edited the function; I think it makes more sense now. Commented Feb 25, 2019 at 10:25
  • Could you clarify what x[a] and x[b] are meant to represent? The value inside the [] must be an index into the array x, but in the current form of the function it is not. Commented Feb 25, 2019 at 10:32
  • Well, it's just the application of the Euclidean distance formula. It should be sqrt((xsub2 - xsub1)**2 + (ysub2 - ysub1)**2). Commented Feb 25, 2019 at 11:17
  • And what are sub2 and sub1? Commented Feb 25, 2019 at 11:19

2 Answers


First we define a function which computes the distance between corresponding rows of two matrices (or, via broadcasting, between a single row and every row of a matrix).

def pairwise_distance(f, s, keepdims=False):
    return np.sqrt(np.sum((f-s)**2, axis=1, keepdims=keepdims))
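
As a quick check of how the broadcasting behaves, here is a minimal sketch that calls the same function with a single row against a whole matrix. The small array `pts` is made up for illustration so that the distances are easy to verify by hand.

```python
import numpy as np

def pairwise_distance(f, s, keepdims=False):
    # Row-wise Euclidean distance; f may be a single row broadcast against s.
    return np.sqrt(np.sum((f - s) ** 2, axis=1, keepdims=keepdims))

# Hypothetical 2-D points (not from the question), chosen for easy arithmetic.
pts = np.array([[0.0, 0.0],
                [3.0, 4.0],
                [6.0, 8.0]])

# pts[0] has shape (2,) and pts has shape (3, 2), so the single row is
# broadcast against every row of pts before the squared differences are summed.
d = pairwise_distance(pts[0], pts)
print(d)  # distances 0, 5 and 10
```

The result has one entry per row of `pts`, which is what the loop in the next step relies on.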

Second we define a function which calculates all possible distances between every pair of rows of the same matrix:

def all_distances(c):
    n = c.shape[0]
    # one row and one column per point, so the result is (n, n) whatever the number of features
    res = np.empty(shape=(n, n), dtype=float)
    for row in range(n):
        res[row, :] = pairwise_distance(c[row], c)  # using numpy broadcasting
    return res

Now we are done

row_distances = all_distances(a)       # distances between the rows of the matrix a
column_distances = all_distances(a.T)  # distances between the columns of the same matrix
row_distances[0, 2]  # distance between first and third row
row_distances[1, 3]  # distance between second and fourth row
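
The explicit loop over rows can also be replaced by a single broadcast. The following is only a sketch of that alternative, not part of the original answer; the function name `distance_matrix` is hypothetical. It builds an intermediate array of shape (n, n, d), so for very large inputs memory may become the limiting factor, in which case the row-by-row loop or a library routine such as `scipy.spatial.distance.cdist` may be preferable.

```python
import numpy as np

def distance_matrix(points):
    # diff[i, j] = points[i] - points[j], shape (n, n, d)
    diff = points[:, np.newaxis, :] - points[np.newaxis, :, :]
    # Sum the squared differences over the feature axis, then take the root.
    return np.sqrt(np.sum(diff ** 2, axis=-1))

a = np.array([[1.64, 0.001, 1.56, 0.1],
              [1.656, 1.21, 0.32, 0.0001],
              [1.0002, 0.0003, 1.111, 0.0003],
              [0.223, 0.6665, 1.2221, 1.659]])

dist = distance_matrix(a)
# dist[i, j] is the distance between point i and point j;
# the matrix is symmetric and its diagonal is zero.
print(dist.shape)
```

Each row of `a` is treated as one point, so `dist[0, 2]` is the distance between the first and third points.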

3 Comments

Hi @Redone R, thank you for your answer! What I want is to create a matrix that takes all the points of the initial matrix as rows, and also as columns, and whose values are the distances between one point and another. This means I need to calculate the distances between all pairs of points. For instance: the distance between the first point and the second, between the first and the third, the first and the fourth, then the second with the first, the second with the third, the second with the fourth, and so on until I have the distance from each point in the matrix to every other point in the matrix.
Please take a look again.

Start with two arrays:

a = np.array([[1.64,0.001,1.56,0.1],
              [1.656,1.21,0.32,0.0001],
              [1.0002,0.0003,1.111,0.0003],
              [0.223,0.6665,1.2221,1.659]])

b = np.array([[1.64,0.001,1.56,0.1],
              [1.656,1.21,0.32,0.0001],
              [1.0002,0.0003,1.111,0.0003],
              [0.223,0.6665,1.2221,1.659]])

To calculate the distance between elements of these arrays you can do:

pairwise_dist_between_a_and_b=[(each**2+b[index]**2)**0.5 for index, each in enumerate(a)]

By doing so you get pairwise_dist_between_a_and_b:

[array([2.31931024e+00, 1.41421356e-03, 2.20617316e+00, 1.41421356e-01]),
 array([2.34193766e+00, 1.71119841e+00, 4.52548340e-01, 1.41421356e-04]),
 array([1.41449641e+00, 4.24264069e-04, 1.57119127e+00, 4.24264069e-04]),
 array([0.31536962, 0.94257334, 1.72831039, 2.3461803 ])]

You can use the same list comprehension for the first and third arrays.
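
Note that the comprehension above combines corresponding entries as sqrt(x**2 + y**2) rather than subtracting them. If what is wanted is the Euclidean distance from every row of the first array to every row of the second and third, one possible sketch uses broadcasting. It assumes each row is a point; the helper name `cross_distances` and the small arrays are hypothetical and chosen only so the results can be checked by hand.

```python
import numpy as np

def cross_distances(p, q):
    # result[i, j] = Euclidean distance between row i of p and row j of q
    diff = p[:, np.newaxis, :] - q[np.newaxis, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=-1))

# Hypothetical points, not the arrays from the question.
a = np.array([[0.0, 0.0],
              [1.0, 1.0]])
b = np.array([[3.0, 4.0],
              [0.0, 0.0]])
c = np.array([[6.0, 8.0]])

a_to_b = cross_distances(a, b)  # shape (2, 2)
a_to_c = cross_distances(a, c)  # shape (2, 1)
print(a_to_b[0, 0], a_to_c[0, 0])  # 5.0 10.0
```

The two results stay separate here; whether they should be combined further depends on what the three arrays represent.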

1 Comment

Hi, how can I achieve this using the 3 matrices? Do I have to calculate the distance from the resulting distance matrix to the third one? Thank you.
