
I am making requests to a public transportation API for data analysis. Several files are in JSON format, which is easy to deal with; however, some files are in .protobuf format.

How can I parse these files into a human-readable format? For example, if I open the .protobuf file in a text editor, this is what I see:

1"$
+(B���uC-(�̉�B
1462;
2"6

    741127020 *F
�ZB�����C-(�̉�B
1583<
3"7

    719255020 *10
K�B�8��C-FH@(�̉�B
1220<
4"7

Thanks!

1 Answer

Protobuf (Protocol Buffers) is a binary format, so it's not human-readable in its raw state. To read it, get the Python GTFS-realtime bindings from Google and install them with:

pip install --upgrade gtfs-realtime-bindings

Once you have those, you can download the pb file or read it locally very easily:

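from google.transit import gtfs_realtime_pb2
import urllib.request

# Empty message object that will hold the decoded feed
feed = gtfs_realtime_pb2.FeedMessage()
pb_url = "http://someURL/someFile.pb"

# Fetch the raw bytes and parse them into the message
with urllib.request.urlopen(pb_url) as response:
    feed.ParseFromString(response.read())

# Printing a message shows its text-format representation
print(feed)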

This will give you something like:

header {
  gtfs_realtime_version: "1.0"
  incrementality: FULL_DATASET
  timestamp: 1579313685
}
entity {
  id: "10-abc-O-1"
  trip_update {
    trip {
      trip_id: "10-1622-O-1"
    }
...
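
If you're curious why the text editor shows gibberish with a few readable bits in between: on the wire, every field is a varint tag (field number shifted left by 3, OR-ed with a wire type) followed by its payload. Strings are stored as plain UTF-8 bytes, so they show through, while tags, lengths and numbers mostly land on unprintable bytes. Below is a rough, schema-less sketch of that layout using only the standard library. `read_varint` and `iter_fields` are names made up for this illustration, the sample bytes are hand-built rather than taken from your file, and group wire types are deliberately not handled.

```python
from typing import Iterator, Tuple, Union


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode one base-128 varint starting at pos; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift  # low 7 bits carry data
        if not byte & 0x80:              # high bit clear: this was the last byte
            return value, pos
        shift += 7


def iter_fields(buf: bytes) -> Iterator[Tuple[int, int, Union[int, bytes]]]:
    """Yield (field_number, wire_type, raw_value) for each top-level field."""
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:    # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:  # fixed 64-bit, left as raw bytes
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited (strings, bytes, nested messages)
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:  # fixed 32-bit, left as raw bytes
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        yield field_number, wire_type, value


# Hand-built sample: field 1 holds the string "1462", field 2 holds the varint 300
sample = b"\x0a\x041462\x10\xac\x02"
for field_number, wire_type, value in iter_fields(sample):
    print(field_number, wire_type, value)
```

Without the schema you only get field numbers and raw values. The bindings add the missing piece by mapping those numbers to names like `trip_id`, which is why using them is the practical route.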

1 Comment

Awesome, thank you so much. Note: take a look at the hyperlink Mark provided. You must instantiate feed like so:

from google.transit import gtfs_realtime_pb2
import urllib
feed = gtfs_realtime_pb2.FeedMessage()
