Using the following data in train_data_sample and the code below, how can I iterate through the latitude and longitude of each index? (See further down for the desired results.)
latitude longitude price
0 55.6632 12.6288 2595000
1 55.6637 12.6291 2850000
2 55.6637 12.6291 2850000
3 55.6632 12.6290 3198000
4 55.6632 12.6290 2995000
5 55.6638 12.6294 2395000
6 55.6637 12.6291 2995000
7 55.6642 12.6285 4495000
8 55.6632 12.6285 3998000
9 55.6638 12.6294 3975000
from numpy import cos, sin, arcsin, sqrt
from math import radians

def haversine(row):
    for index in train_data_sample.index:
        lon1 = train_data_sample["longitude"].loc[train_data_sample.index==index]
        lat1 = train_data_sample["latitude"].loc[train_data_sample.index==index]
        lon2 = row['longitude']
        lat2 = row['latitude']
        lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
        dlon = lon2 - lon1
        dlat = lat2 - lat1
        a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
        c = 2 * arcsin(sqrt(a))
        km = 6367 * c
        return km

def insert_dist(df):
    df["distance"+str(index)] = df.apply(lambda row: haversine(row), axis=1)
    return df

print(insert_dist(train_data_sample))
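This is the result for index 0. It compares the coordinates of index 0 with those of every other row and returns the distance in kilometers (the function multiplies by an Earth radius of 6367 km). So the distance between the coordinates of index 0 and index 1 is about 0.059 km, i.e. roughly 59 meters.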
latitude longitude price distance0
0 55.6632 12.6288 2595000 0.000000
1 55.6637 12.6291 2850000 0.058658
2 55.6637 12.6291 2850000 0.058658
3 55.6632 12.6290 3198000 0.012536
4 55.6632 12.6290 2995000 0.012536
5 55.6638 12.6294 2395000 0.076550
6 55.6637 12.6291 2995000 0.058658
7 55.6642 12.6285 4495000 0.112705
8 55.6632 12.6285 3998000 0.018804
9 55.6638 12.6294 3975000 0.076550
The end result should contain not only distance0, but also distance1, distance2, and so on: one distance column for every index in the data.
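For reference, here is a minimal sketch of one way the full set of columns could be produced. It is not the only approach: it swaps the row-wise apply for NumPy broadcasting, so all pairwise distances are computed in one step. The names add_pairwise_distances and EARTH_RADIUS_KM are made up for this illustration, and the radius of 6367 km is simply the one used in the code above.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6367  # hypothetical constant name; value taken from the question


def add_pairwise_distances(df):
    """Return a copy of df with one 'distance<i>' column (in km) per index label."""
    lat = np.radians(df["latitude"].to_numpy())
    lon = np.radians(df["longitude"].to_numpy())

    # Broadcasting: element [i, j] holds the difference between row i and row j.
    dlat = lat[:, None] - lat[None, :]
    dlon = lon[:, None] - lon[None, :]

    # Haversine formula, evaluated for every pair of rows at once.
    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2) ** 2)
    km = 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

    # Column j holds the distance from every row to the row labelled j.
    dist = pd.DataFrame(km, index=df.index,
                        columns=[f"distance{i}" for i in df.index])
    return pd.concat([df, dist], axis=1)


# Sample data copied from the question.
train_data_sample = pd.DataFrame({
    "latitude": [55.6632, 55.6637, 55.6637, 55.6632, 55.6632,
                 55.6638, 55.6637, 55.6642, 55.6632, 55.6638],
    "longitude": [12.6288, 12.6291, 12.6291, 12.6290, 12.6290,
                  12.6294, 12.6291, 12.6285, 12.6285, 12.6294],
    "price": [2595000, 2850000, 2850000, 3198000, 2995000,
              2395000, 2995000, 4495000, 3998000, 3975000],
})

result = add_pairwise_distances(train_data_sample)
print(result)
```

The intermediate matrix has one entry per pair of rows, so memory grows with the square of the row count. That is fine for a small sample like this one, but for a large data set a chunked or loop-based variant would probably be needed.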