I have a pandas dataframe with GPS points that looks like this:
import pandas as pd
d = {
    'user': ['A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C'],
    'lat': [37.75243634842733, 37.75344580658182, 37.75405656449232, 37.753649393112181, 37.75409897804892, 37.753937806404586, 37.72767062183685, 37.72710631810977, 37.72605407110467, 37.71141865080228, 37.712199505873926, 37.713285899241896, 37.71428740401767, 37.712810604103346],
    'lon': [-122.41924881935118, -122.42006421089171, -122.419216632843, -122.41784334182738, -122.4169099330902, -122.41549372673035, -122.3878937959671, -122.3884356021881, -122.38841414451599, -122.44688630104064, -122.44474053382874, -122.44361400604248, -122.44260549545288, -122.44156479835509],
}
df = pd.DataFrame(data=d)
user lat lon
0 A 37.752436 -122.419249
1 A 37.753446 -122.420064
2 A 37.754057 -122.419217
3 A 37.753649 -122.417843
4 A 37.754099 -122.416910
5 A 37.753938 -122.415494
6 B 37.727671 -122.387894
7 B 37.727106 -122.388436
8 B 37.726054 -122.388414
9 C 37.711419 -122.446886
10 C 37.712200 -122.444741
11 C 37.713286 -122.443614
12 C 37.714287 -122.442605
13 C 37.712811 -122.441565
Using the functions below, I can send all of the coordinates in the df directly to an OSRM request to map-match these GPS points:
import numpy as np
from typing import Dict, Any, List, Tuple
import requests
# Format NumPy array of (lat, lon) coordinates into a concatenated string formatted for OSRM server
def format_coords(coords: np.ndarray) -> str:
    coords = ";".join([f"{lon:f},{lat:f}" for lat, lon in coords])
    return coords

# Forward request to the OSRM server and return a dictionary of the JSON response.
def make_request(
    coords: np.ndarray,
) -> Dict[str, Any]:
    coords = format_coords(coords)
    url = f"http://router.project-osrm.org/match/v1/car/{coords}"
    r = requests.get(url)
    return r.json()
coords = df[['lat', 'lon']].values
# Make request against the OSRM HTTP server
output = make_request(coords)
However, the df contains separate GPS trajectories generated by different users, so I want to write a function that loops through the dataframe and sends each user's coordinates to the request as its own group rather than all at once. What is the best way to do this?
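To make the goal concrete, here is a minimal sketch of the per-user loop I have in mind, assuming `DataFrame.groupby` is an acceptable way to split the rows. The names `match_per_user`, `request_fn`, `fake_request`, and `demo` are hypothetical and only for illustration. `fake_request` stands in for `make_request` so the sketch runs without contacting the OSRM server.

```python
from typing import Any, Callable, Dict

import numpy as np
import pandas as pd


def match_per_user(
    df: pd.DataFrame,
    request_fn: Callable[[np.ndarray], Dict[str, Any]],
) -> Dict[str, Dict[str, Any]]:
    """Call request_fn once per user, passing only that user's (lat, lon) rows."""
    results: Dict[str, Dict[str, Any]] = {}
    # groupby preserves the original row order within each group;
    # sort=False keeps the groups in order of first appearance.
    for user, group in df.groupby("user", sort=False):
        coords = group[["lat", "lon"]].to_numpy()
        results[user] = request_fn(coords)
    return results


# Hypothetical stand-in for make_request: reports how many points it received.
def fake_request(coords: np.ndarray) -> Dict[str, Any]:
    return {"n_points": len(coords)}


# Small demo frame using the first rows of users A and B from the question.
demo = pd.DataFrame(
    {
        "user": ["A", "A", "B"],
        "lat": [37.752436, 37.753446, 37.727671],
        "lon": [-122.419249, -122.420064, -122.387894],
    }
)
summary = match_per_user(demo, fake_request)
```

With the real function, the call would presumably be `outputs = match_per_user(df, make_request)`, which sends one request per user and keys each JSON response by the user label.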