I am reading a binary file and need to remove bytes at specific locations after a seek. The code below reads the file, but I don't know how to remove 4 bytes from the middle of it.

 import os
 import struct

 with open("esears36_short.dat", "rb") as f:
    data = f.read(2)
    number = struct.unpack(">h", data)[0]
    f.seek(number, 1)
    # need code here to remove the 4 bytes at the current position

I need to run this in a loop until EOF, removing 4 bytes after every n bytes, where n is the value specified in the number field.

Value of the number field in this case: 28045

Please help!

  • So do you want to delete the bytes of the file at positions 28047 to 28051? Commented Apr 29, 2020 at 4:37
  • Yes. Then seek 28045 bytes and delete bytes 56094 to 56097, and so on. Commented Apr 29, 2020 at 4:43
  • You want to move everything forward 4 bytes and thus make the file 4 bytes smaller? This is easier to do if you write a new, smaller file. Commented Apr 29, 2020 at 4:44
  • Is it possible to edit the same file in place? Copying a large file to another file multiple times will take a lot of time. Commented Apr 29, 2020 at 4:49

1 Answer

To remove 4 bytes you have to copy the rest of the file forward by 4 bytes, which can be messy because you are reading and writing buffers in the same file. It's easier to write a new file and rename it over the original. In that case, you just seek ahead 4 bytes wherever bytes need to be dropped.

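import os
import struct

src = "esears36_short.dat"
tmp = src + ".tmp"

with open(src, "rb") as f, open(tmp, "wb") as f_out:
    # The first 2 bytes hold the record length as a big-endian 16-bit integer.
    data = f.read(2)
    number = struct.unpack(">h", data)[0]
    # Skip 2 more bytes so that 4 bytes are dropped at the front.
    # Remove this line if only the 2-byte number should be dropped.
    f.seek(2, 1)
    while True:
        buf = f.read(number)  # returns b"" at EOF
        if not buf:
            break
        f_out.write(buf)
        f.seek(4, 1)  # skip the 4 bytes to be removed
# Replace the original with the new file in a single step.
os.replace(tmp, src)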

Although you are writing a new file, you are doing less actual copying than you would by shifting the rest of the file forward for every removal.
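
If the file really must be edited in place, as asked in the comments, the same single pass can be done inside one file by keeping separate read and write offsets and truncating at the end. The following is only a minimal sketch: the function name `compact_in_place` and its parameters are hypothetical, and the layout (drop `start` bytes at the front, then alternately keep `record_len` bytes and drop `gap_len` bytes) is an assumption that should be checked against the real file format. Unlike the rename approach, an interruption midway leaves the file partially rewritten, so keep a backup.

```python
def compact_in_place(path, record_len, gap_len=4, start=0):
    """Drop `start` bytes at the front, then keep `record_len` bytes and
    drop `gap_len` bytes repeatedly until EOF. Returns the new file size.

    Hypothetical helper: the layout described above is an assumption.
    """
    if record_len <= 0:
        # read(0) returns b"" and would make the loop truncate the file to 0.
        raise ValueError("record_len must be positive")
    if gap_len < 0 or start < 0:
        raise ValueError("gap_len and start must not be negative")

    with open(path, "r+b") as f:
        read_pos = start   # where the next kept chunk begins
        write_pos = 0      # where the next kept chunk is moved to
        while True:
            f.seek(read_pos)
            buf = f.read(record_len)
            if not buf:
                break
            # write_pos never exceeds read_pos, so this write only touches
            # bytes that have already been read.
            f.seek(write_pos)
            f.write(buf)
            write_pos += len(buf)
            read_pos += len(buf) + gap_len
        # Cut off the leftover tail that is no longer part of the data.
        f.truncate(write_pos)
    return write_pos
```

For the file in the question, a call might look like `compact_in_place("esears36_short.dat", record_len=28045, gap_len=4, start=2)`, assuming only the 2-byte number is dropped at the front.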

6 Comments

Sorry if my question was not clear. I need to remove 4 bytes after every n bytes specified in the number field, until EOF. In this case, 28047 to 28051, 56094 to 56097, and so on.
@Arvinth - does the counting start at the beginning of the file or after number has been read? Should the first 2 bytes be in the output file? And number is only read once in the first 2 bytes?
Actually, the counting starts at the beginning of the file. The first 2 bytes should not be in the output file. The number is present after every 28045 bytes, but its value stays the same.
Okay, current rev starts "number" writes starting from 2 then skips 4 til eof.
Not sure if that should have been 4 at the front or just the 2 for the number value.