I have a pandas dataframe with GPS points that looks like this:

    import pandas as pd
    d = {'user': ['A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C'], 'lat': [37.75243634842733, 37.75344580658182, 37.75405656449232, 37.753649393112181, 37.75409897804892, 37.753937806404586, 37.72767062183685, 37.72710631810977, 37.72605407110467, 37.71141865080228, 37.712199505873926, 37.713285899241896, 37.71428740401767, 37.712810604103346], 'lon': [-122.41924881935118, -122.42006421089171, -122.419216632843, -122.41784334182738, -122.4169099330902, -122.41549372673035, -122.3878937959671, -122.3884356021881, -122.38841414451599, -122.44688630104064, -122.44474053382874, -122.44361400604248, -122.44260549545288, -122.44156479835509]}
    df = pd.DataFrame(data=d)
    

    user    lat         lon
0   A       37.752436   -122.419249
1   A       37.753446   -122.420064
2   A       37.754057   -122.419217
3   A       37.753649   -122.417843
4   A       37.754099   -122.416910
5   A       37.753938   -122.415494
6   B       37.727671   -122.387894
7   B       37.727106   -122.388436
8   B       37.726054   -122.388414
9   C       37.711419   -122.446886
10  C       37.712200   -122.444741
11  C       37.713286   -122.443614
12  C       37.714287   -122.442605
13  C       37.712811   -122.441565

Using the functions below, I can feed all of these coordinates from the df directly into an OSRM request to map-match the GPS points:

import numpy as np
from typing import Any, Dict
import requests
# Format a NumPy array of (lat, lon) coordinates as a ";"-separated string of "lon,lat" pairs for the OSRM server
def format_coords(coords: np.ndarray) -> str:
    return ";".join([f"{lon:f},{lat:f}" for lat, lon in coords])

# Forward request to the OSRM server and return a dictionary of the JSON response.
def make_request(
        coords: np.ndarray,
    ) -> Dict[str, Any]:
    coords = format_coords(coords)
    url = f"http://router.project-osrm.org/match/v1/car/{coords}"
    r = requests.get(url)
    return r.json()

coords = df[['lat', 'lon']].values

# Make request against the OSRM HTTP server
output = make_request(coords)

However, since the df consists of different GPS trajectories generated by different users, I want to write a function that loops through this dataframe and feeds the corresponding set of coordinates to the request per user group and not all at once. What is the best way to do this?

1 Answer

You can group the dataframe by the user column, apply make_request to each group, and store each result in the output dict with the user as the key:

output = {}
for user, g in df.groupby('user'):
    output[user] = make_request(g[['lat', 'lon']].values)
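
To check what each group hands to make_request without contacting the OSRM server, the same loop can be run against a stand-in function. This is only a sketch: fake_request is a hypothetical placeholder (not part of the question's code) that reports how many points it received, and the dataframe is a small subset of the question's rows for users A and B.

```python
import numpy as np
import pandas as pd

# A few rows taken from the question's dataframe (rounded as printed there)
df = pd.DataFrame({
    "user": ["A", "A", "B", "B", "B"],
    "lat": [37.752436, 37.753446, 37.727671, 37.727106, 37.726054],
    "lon": [-122.419249, -122.420064, -122.387894, -122.388436, -122.388414],
})

def fake_request(coords: np.ndarray) -> dict:
    # Hypothetical stand-in for make_request: no HTTP call is made,
    # it only reports the number of (lat, lon) rows it was given.
    return {"n_points": len(coords)}

# Same pattern as the loop above, written as a dict comprehension
output = {
    user: fake_request(g[["lat", "lon"]].values)
    for user, g in df.groupby("user")
}
print(output)
```

Swapping fake_request back to make_request gives one OSRM response per user.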

3 Comments

This is exactly what I was looking for. Thanks a lot!
Last question: if my pandas dataframe also had a date column, populated with dates like '2018-02-03', what would be the best way to make the request grouped per user and day? Running output = {} for user, g in df.groupby(['date', 'user']): output[user] = make_request(g[['lat', 'lon']].values) gives me an invalid JSON object. Any ideas?
It should work to get the data into the dict; you would just get tuples for dict keys, like ('2021-01-01', 'A'), ('2021-01-01', 'B'), .... Are you trying to convert this dict to JSON? Which command specifically fails?
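
The per-user, per-day grouping discussed in these comments might look like the sketch below. It rests on assumptions: the date values are made up for illustration, fake_request is a hypothetical stand-in for make_request so that no HTTP call is made, and the "|" key separator is an arbitrary choice. Grouping by a list of columns yields tuple keys, which json.dumps rejects with a TypeError, so the last step shows one possible way to turn them into strings if the dict needs to be serialized. Whether that was the cause of the error reported above is not confirmed.

```python
import json

import pandas as pd

# Rows from the question's dataframe plus an illustrative (made-up) date column
df = pd.DataFrame({
    "user": ["A", "A", "A", "B", "B"],
    "date": ["2018-02-03", "2018-02-03", "2018-02-04", "2018-02-03", "2018-02-03"],
    "lat": [37.752436, 37.753446, 37.754057, 37.727671, 37.727106],
    "lon": [-122.419249, -122.420064, -122.419217, -122.387894, -122.388436],
})

def fake_request(coords) -> dict:
    # Hypothetical stand-in for make_request: reports the number of points only.
    return {"n_points": len(coords)}

output = {}
for (date, user), g in df.groupby(["date", "user"]):
    # Each key is a (date, user) tuple, one entry per user and day
    output[(date, user)] = fake_request(g[["lat", "lon"]].values)

# Tuple keys cannot be serialized by json.dumps, so join them into strings first
json_ready = {f"{date}|{user}": result for (date, user), result in output.items()}
as_json = json.dumps(json_ready)
print(as_json)
```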
