I'm reading binary files of about 150,000 KB (roughly 150 MB) each. Each file contains roughly 3,000 structured binary messages, and I'm trying to figure out the quickest way to process them. From each message, I only need to read about 30 lines of data. The messages have headers that let me jump to specific portions of a message and find the data I need.
Is it more efficient to unpack the entire message (about 50 KB each) and pull my data from the resulting tuple, which includes a lot of data I don't actually need? Or would it cost less to seek to each line of data I need in every message and unpack only those 30 lines? Alternatively, is this something better suited to mmap?
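
To make the comparison concrete, here is a minimal sketch of the three options, assuming Python's `struct` module. The layout is invented for illustration: fixed 50 KB messages made of little-endian 32-bit integers, with 30 hard-coded byte offsets in `WANTED` standing in for the offsets that would really come from each message's header. The names `read_full`, `read_seek`, `read_mmap`, and `make_sample` are hypothetical helpers, not part of any library.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical layout, for illustration only.
MSG_SIZE = 50 * 1024                              # bytes per message
FULL = struct.Struct("<%dI" % (MSG_SIZE // 4))    # the whole message
FIELD = struct.Struct("<I")                       # one wanted value
WANTED = [i * 400 for i in range(30)]             # made-up byte offsets


def read_full(path, n_msgs):
    """Option 1: unpack each whole message, then pick the wanted values."""
    out = []
    with open(path, "rb") as f:
        for _ in range(n_msgs):
            values = FULL.unpack(f.read(MSG_SIZE))
            out.append([values[off // 4] for off in WANTED])
    return out


def read_seek(path, n_msgs):
    """Option 2: seek to each wanted value and unpack only that value."""
    out = []
    with open(path, "rb") as f:
        for m in range(n_msgs):
            row = []
            for off in WANTED:
                f.seek(m * MSG_SIZE + off)
                row.append(FIELD.unpack(f.read(FIELD.size))[0])
            out.append(row)
    return out


def read_mmap(path, n_msgs):
    """Option 3: map the file and unpack in place with unpack_from."""
    out = []
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        for m in range(n_msgs):
            base = m * MSG_SIZE
            out.append([FIELD.unpack_from(mm, base + off)[0]
                        for off in WANTED])
    return out


def make_sample(n_msgs):
    """Write a small test file in the made-up layout and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for m in range(n_msgs):
            f.write(FULL.pack(*range(m, m + MSG_SIZE // 4)))
    return path
```

All three functions return the same rows, so they could be timed against each other (for example with `timeit`) on a real file. Which one wins likely depends on the actual message layout and on how much of the file is already in the OS cache.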