I am working with the IRIS dataset. I have two sets of data, (1 training set) (2 test set). Now I want to calculate the euclidean distance between every test set row and the train set rows. However, I only want to include the first 4 points of the row.
A working example would be:
dist = np.linalg.norm(inner1test[0][0:4]-inner1train[0][0:4])
print(dist)
***output: 3.034243***
The problem is that I have 120 training set points and 30 test set points - so i would have to do 2700 operations manually, thus I thought about iterating through with a for-loop. Unfortunately, every of my attemps is failing.
This would be my best attempt, which shows the error message
for i in inner1test:
for number in inner1train:
dist = np.linalg.norm(inner1test[i][0:4]-inner1train[number][0:4])
print(dist)
(IndexError: arrays used as indices must be of integer (or boolean) type)
What would be the best solution to iterate through this array?
ps: I will also provide a screenshot for better vizualisation.
