I have two dataframes. Dataframe A is:
[distance] [measure]
17442.77000 32.792658
17442.95100 32.792658
17517.49200 37.648482
17518.29600 37.648482
17565.77600 38.287118
17565.88800 38.287118
17596.93700 41.203340
17597.29700 41.203340
17602.16400 41.477979
17602.83900 41.612774
17618.16400 42.479890
17618.71100 42.681591
and data frame B which is:
[mileage] [Driver]
17442.8 name1
17517.5 name2
17565.8 name3
17597.2 name4
17602.5 name5
17618.4 name6
for every [mileage] row in dataframe B, I'd like to find the two rows from [distance] in dataframe A in which A.loc[(A['distance']>= milage_value) & A['distance']<= mileage_value]so I could have something like this:
17442.77000 32.792658
17442.8 name1
17442.95100 32.792658
17517.49200 37.648482
17517.5 name2
17518.29600 37.648482
. .
. .
So I could apply the following function in a rolling window size of 3:
def f(x):
return df.iloc[0,1]+(df.iloc[2,1]-df.iloc[0,1])*((df.iloc[1,0]-df.iloc[0,0])/(df.iloc[2,0]-df.iloc[0,0]))
a = df.rolling(window=3, min_periods=1).apply(f)[::3].reset_index(drop=True)
So far I have been concatenating the two Dfs and sorting values to generate triplets like above, but the problem arise when two values from df B is within a distance range in A[distance]. Any hint/advice is much appreciated!