I have an interesting problem but cannot find the relevant literature, so need algorithem suggestions or just the correct way to phrase the problem so I can find the literature.
I'm given a vector of floats of varying length coming from a 2D array. I know the orientation (i.e. which axis I'm looking in). I need to find the start index of the vector. Essentially, I'm cross-correlating a short vector with a long vector.
I have implemented a brute force approach but naturally it grows as O(n^2) where n is the dimension of the 2D array.
Q1) What is an efficient algorithem to tackel this problem?
Q2) How is this type of problem called (so I can find relevant papers)?
There is measurement error so it will never be an exact match.
Here the brute force approach, where I look for the minimum of the norm of two vectors:
import time
import numpy as np
def get_distance(a: np.ndarray, b: np.ndarray) -> float:
""" Calculate the distance between two vectors. """
return np.linalg.norm(a - b)
def find_min_distance_subvector(array: np.ndarray, x: np.ndarray) -> tuple[int, float]:
leng = len(x)
min_index = 0
min_distance = np.inf
# Assuming we know the orientation of the vector
for ii in range(0, len(array) - leng):
# Extract a sub-vector of the same length as x, starting from index ii
sub_vec = array[ii:ii + leng]
dist = get_distance(sub_vec, x)
if dist < min_distance:
min_distance = dist
min_index = ii
return min_index, min_distance
def main():
leng = 100
size = 2000
# Create the search map
arr = np.random.random((size, size))
# Pick a random sub-vector from the map
index = np.random.randint(0, size - leng)
x = arr[index:index + leng]
start_time = time.time()
min_index, min_metric = find_min_distance_subvector(arr, x)
end_time = time.time()
print(f"Minimum distance: {min_metric} at index {min_index}, correct index: {index}, "
f"time taken: {end_time - start_time:.4f} seconds")
if __name__ == '__main__':
main()
Thank you for your help
lengrows on all (size) columns.