I need my code to work on both Linux and Windows. I have a binary file that contains a text header with date and time information, which I'd like to extract. An example of the extracted part (i.e. how the information is stored in the text header) is shown in the commented part of the code. The rest of the code is written in Python, so I'd like to do this extraction in Python as well. On Linux, I'd simply use subprocess and grep (ref):
import subprocess
hosts = subprocess.check_output("grep -E -a 'Date' /path/Bckgrnd.bip", shell=True)
sentence = hosts.decode('utf-8')
# '---------------------------- Date:09/09/2020 Time:11:26:19 ----------------------------\n Capture Time/Date:\t11:26:17 on 09/09/2020\n---------------------------- Date:09/09/2020 Time:11:26:19 ----------------------------\n'
date = sentence[sentence.index('Date:')+5:sentence.index('Date:')+13]
time = sentence[sentence.index('Time:')+5:sentence.index('Time:')+13]
print(date, time)
# 09/09/20 11:26:19
The problem is that this is going to fail on Windows. An alternative is to load the file in Python:
file_input = '/path/Bckgrnd.bip'
with open(file_input, 'rb') as f:
    s = f.read()  # reads the whole file into memory
print(s.find(b'Date'))
# 498
date = s[s.find(b'Date')+5:s.find(b'Date')+13].decode('utf-8')
time = s[s.find(b'Time')+5:s.find(b'Time')+13].decode('utf-8')
print(date, time)
That has one main issue: it has to read the entire file into memory, and if the file is large, that is a problem. Is there a way to work around the OS issues with grep? Is there an alternative in pure Python that does not load the entire binary?
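For reference, one pure-Python direction that might avoid reading the whole file is to memory-map it and run a bytes regex over the map. The following is only a sketch under assumptions: the function name `find_date_time` and the regex are hypothetical, based solely on the header line shown in the comment above, and the code assumes the file is non-empty (mapping an empty file raises an error).

```python
import mmap
import re

# Assumed pattern, mirroring the header line "Date:09/09/2020 Time:11:26:19"
HEADER_RE = re.compile(rb"Date:(\d{2}/\d{2}/\d{4})\s+Time:(\d{2}:\d{2}:\d{2})")


def find_date_time(path):
    """Return (date, time) from the first matching header line, or None."""
    with open(path, "rb") as f:
        # Map the file read-only; the OS pages data in lazily instead of
        # the script copying the whole file into a bytes object.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            match = HEADER_RE.search(mm)
            if match is None:
                return None
            # Decode while the map is still open
            date, time = (g.decode("ascii") for g in match.groups())
            return date, time
```

Calling `find_date_time('/path/Bckgrnd.bip')` would be the intended usage. Since the search stops at the first match and the header in my file sits near the start (offset 498), only the beginning of the file should need to be touched, and `mmap` is available on both Linux and Windows.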
Update:
Regarding speed: I believe grep is faster than pure Python, so keeping it would be better not only memory-wise but also speed-wise.
Note that even grep treats the file as binary (hence the -a flag, as mentioned e.g. here).