I have an OSM PBF file that I am trying to parse. According to the format specification, and as confirmed by opening the file in Sublime Text, the first four bytes are:

0000 000d

Why then, if I run a very simple Python program:

PBFfile = open(r'MyFilePath.osm.pbf')
PBFfile.read(4)[3].encode('hex')

does it return 0a (the next byte in the sequence) rather than the expected 0d? Is there an obvious explanation?

I am on Windows 7, Python 2.7.5 (32-bit).

    On Windows, text mode translates each '\r\n' pair to '\n' when reading, so the '\r' (0x0d) is dropped. Open the file in binary mode: open(filename, 'rb'). Commented Feb 20, 2015 at 11:04

1 Answer

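You're opening the file in text mode (docs). On Windows, text mode translates each "\r\n" pair (0x0d 0x0a) into a single "\n" (0x0a) when reading. Your file starts with 00 00 00 0d 0a, so the 0d is swallowed and the fourth byte you get back is 0a.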
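
The newline handling can be reproduced in memory. This is a minimal sketch for Python 3, not the Python 2.7 used in the question: there, a text stream opened with newline=None converts "\r\n" to "\n" on every platform, which gives the same result the question sees on Windows. The first five bytes are taken from the question; the trailing filler is made up for illustration.

```python
import io

# Bytes 0-4 are from the question; b"filler" is a hypothetical stand-in
# for the rest of the file.
raw = b"\x00\x00\x00\x0d\x0a" + b"filler"

# Binary view: the bytes come back exactly as stored.
binary_first4 = io.BytesIO(raw).read(4)

# Text view: newline=None enables newline translation, so the
# "\r\n" pair (0x0d 0x0a) is collapsed into a single "\n" (0x0a).
text_stream = io.TextIOWrapper(io.BytesIO(raw), encoding="latin-1", newline=None)
text_first4 = text_stream.read(4)

print(binary_first4.hex())                  # 0000000d
print(text_first4.encode("latin-1").hex())  # 0000000a
```

The text view still returns four characters, but the fourth one is the byte that originally sat in position five.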

To fix this, open the file in binary mode:

PBFfile = open(r'MyFilePath.osm.pbf', 'rb')
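
As a rough sketch of how the four bytes might then be used (Python 3 syntax; the helper name read_header_length and the temporary stand-in file are assumptions for illustration, not part of the question), the prefix can be read in binary mode and decoded as a big-endian 32-bit integer, which is how the PBF format stores the length of the header that follows:

```python
import os
import struct
import tempfile


def read_header_length(path):
    """Return the 4-byte big-endian length prefix at the start of the file."""
    with open(path, "rb") as pbf_file:  # 'rb' turns off newline translation
        prefix = pbf_file.read(4)
    if len(prefix) != 4:
        raise ValueError("file is shorter than 4 bytes")
    (length,) = struct.unpack(">i", prefix)
    return length


# Hypothetical stand-in for MyFilePath.osm.pbf: the five leading bytes from
# the question followed by placeholder data.
with tempfile.NamedTemporaryFile(delete=False, suffix=".osm.pbf") as tmp:
    tmp.write(b"\x00\x00\x00\x0d\x0a" + b"placeholder")
    sample_path = tmp.name

header_length = read_header_length(sample_path)
os.remove(sample_path)

print(header_length)  # 13
```

Using a with block also closes the file handle, which the one-line open() call leaves to the garbage collector.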
1 Comment

perfect, thank you. I never realised it was so important. Guess it's a Windows thing :(
