Consider a file that contains binary data represented as bytes:

with open('foo', 'rb') as f:
    bs = f.read()
    print(bs)
    # b'\x00\x01\x00\x01\x00\x01'...

Each byte holds either the value 0 or the value 1, so one byte stands for one bit.

What is the most performant way to take a group of 32 of these bit-valued bytes and parse them into a 32-bit integer? The struct module is probably what I need, but I couldn't find an immediate way to do this.

Alternative methods that cast the bytes to characters and then parse the integer from a bit string, e.g. int('01010101...', 2), are not fast enough for my use case.

  • So you want to read 32 bits and interpret that as an int? Does the file only contain 32 bits or is it many 32-bit numbers? Commented Nov 2, 2017 at 19:16
  • The file contains many numbers if it matters for the solution Commented Nov 2, 2017 at 19:18
  • @YuvalAdam, did you try struct? Commented Nov 2, 2017 at 19:21

1 Answer

Workaround Solutions

Consider the 32-bit test pattern 1010...10, stored as one byte per bit:

b = b'\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00'
print(0b10101010101010101010101010101010)
# 2863311530

Map bytes to string, then parse the int:

s = ''.join(map(lambda x: chr(x+48), b))
i = int(s, 2)
print(i)
# 2863311530

Iterate over the bytes and build the integer using bitshifts:

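idx = 0  # number of bits consumed for the current integer
tmp = 0  # integer being assembled, most significant bit first
for bit in b:
    tmp <<= 1
    tmp |= bit
    idx += 1
    if idx == 32:
        # A full 32-bit group is complete: emit it and start over.
        print(tmp)
        idx = 0
        tmp = 0
# 2863311530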
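
A third variant, offered as a sketch rather than a measured result: int() also accepts a bytes object when a base is given, so the per-character chr() calls can be replaced by a single bytes.translate() over the whole buffer. The names parse_words and _TO_ASCII are made up for this example, and it assumes the input contains only the byte values 0 and 1, with the most significant bit first. Whether it is fast enough for a given use case would need to be timed.

```python
# Translation table mapping byte value 0 -> ASCII '0' and 1 -> ASCII '1'.
_TO_ASCII = bytes.maketrans(b'\x00\x01', b'01')

def parse_words(data, width=32):
    """Yield one integer per `width` bit-valued bytes (MSB first).

    A trailing group shorter than `width` is ignored.
    """
    text = data.translate(_TO_ASCII)  # one pass over the whole buffer
    for start in range(0, len(text) - width + 1, width):
        # int() parses an ASCII bytes literal directly when a base is given.
        yield int(text[start:start + width], 2)

b = b'\x01\x00' * 16           # the 1010...10 test pattern from above
print(list(parse_words(b * 2)))
# [2863311530, 2863311530]
```

For a file, data would be the result of f.read() from the question, and the generator yields one integer per 32 bytes read.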

1 Comment

You said the int(..., 2) doesn't perform fast enough, but it's likely the map + lambda combination that's taking up the most time. Try s = ''.join([chr(x+48) for x in b]), that should give you a significant speedup.
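
One way to check this suggestion is to time both variants on the same input with timeit. The sketch below uses hypothetical function names (with_map, with_listcomp) and an arbitrary repetition count; the timings depend on the machine and Python version, so no expected numbers are shown.

```python
import timeit

b = b'\x01\x00' * 16  # the 32-byte test pattern from the answer

def with_map(bits):
    # Variant from the answer: map + lambda.
    return int(''.join(map(lambda x: chr(x + 48), bits)), 2)

def with_listcomp(bits):
    # Variant suggested in the comment: list comprehension.
    return int(''.join([chr(x + 48) for x in bits]), 2)

# Both variants must agree before their timings are worth comparing.
assert with_map(b) == with_listcomp(b) == 2863311530

for fn in (with_map, with_listcomp):
    elapsed = timeit.timeit(lambda: fn(b), number=20_000)
    print(f'{fn.__name__}: {elapsed:.3f} s for 20,000 calls')
```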
