I created a cosine similarity function that gives the correct results when called with individual vectors, but when I supply a list of vectors I get different results. Isn't NumPy supposed to apply the formula to every element in the list? Is my understanding wrong?
Cosine similarity:
import numpy as np

def cosine_similarity(vec1, vec2):
    return np.inner(vec1, vec2) / (np.linalg.norm(vec1) * np.linalg.norm(vec2))
Example:
a = [1, 2, 3]
b = [4, 5, 6]
print(cosine_similarity(a, a), cosine_similarity(a, b), cosine_similarity(a, [a, b]))
With the result:
1.0 0.9746318461970762 [0.39223227 0.8965309 ]
The first two values are correct. I expected the array to contain those same two values, but it doesn't. Is this just not possible, or do I have to change something?
np.linalg.norm(vec2) needs to be called with the axis argument. When you pass [a, b] into the norm function without axis=-1, it computes a single norm of the whole 2x3 matrix (the Frobenius norm) instead of the norm of each vector. np.linalg.norm(vec2, axis=-1) works as you expected.
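A minimal sketch of that change, using the a and b from the question. The np.asarray conversions are my own addition, and this assumes vec1 is a single vector while vec2 is either one vector or a list of vectors:

```python
import numpy as np

def cosine_similarity(vec1, vec2):
    # Convert lists to arrays up front (an addition, not in the original function).
    vec1 = np.asarray(vec1, dtype=float)
    vec2 = np.asarray(vec2, dtype=float)
    # axis=-1 yields one norm per vector (per row) instead of a single matrix norm.
    norms = np.linalg.norm(vec1, axis=-1) * np.linalg.norm(vec2, axis=-1)
    return np.inner(vec1, vec2) / norms

a = [1, 2, 3]
b = [4, 5, 6]

# Without axis, the 2x3 input collapses to one number (the Frobenius norm);
# with axis=-1 you get one norm per row.
print(np.linalg.norm([a, b]))
print(np.linalg.norm([a, b], axis=-1))

# The batched call should now agree with the two single-vector calls.
print(cosine_similarity(a, a), cosine_similarity(a, b), cosine_similarity(a, [a, b]))
```

If both arguments are lists of vectors, the denominator above no longer lines up with the matrix that np.inner returns, so that case would need something like np.outer of the two norm arrays.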