0

I have a file containing little-endian bytes. I want to take N bytes, specify the endianness, and convert them into a decimal integer using Python (any version). How do I do this correctly?

3
  • You can probably use the struct module. How big is N? Commented Aug 17, 2016 at 9:04
  • Possible duplicate of Endianness of integers in Python Commented Aug 17, 2016 at 9:07
  • N can be up to file size. Commented Aug 17, 2016 at 9:35

3 Answers

4

In Python 3 you can use something like this:

int.from_bytes(byte_string, byteorder='little')
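
As a slightly fuller sketch of the same idea, the snippet below reads n bytes from a binary stream and converts them. The helper name read_int and the io.BytesIO stream standing in for a real file are illustrative assumptions, not part of the original answer.

```python
import io

def read_int(stream, n, byteorder="little", signed=False):
    """Read n bytes from a binary stream and interpret them as one integer.

    read_int is a hypothetical helper name used only for this example.
    """
    chunk = stream.read(n)
    if len(chunk) != n:
        raise ValueError("expected %d bytes, got %d" % (n, len(chunk)))
    return int.from_bytes(chunk, byteorder=byteorder, signed=signed)

# io.BytesIO stands in for open(path, "rb") here.
stream = io.BytesIO(b"\x01\x02\x03\x04\xff\xff")
first = read_int(stream, 4)                # unsigned, equals 0x04030201
second = read_int(stream, 2, signed=True)  # 0xffff read as two's complement
print(first, second)
```

Passing signed=True matters only if the bytes hold a two's-complement value; the default treats them as unsigned.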

4 Comments

Ooooo I did not know that. +1
Ok, we have a solution for Python 3. How to do this for Python 2?
Using struct as @juanpa.arrivillaga has mentioned
@warchantua: I've posted some Python 2 code. It's not as pretty as the Python 3 version, but it works. :)
2

As Harshad Mulmuley's answer shows, this is easy in Python 3, using the int.from_bytes method. In Python 2, it's a little trickier.

The struct module is designed to handle standard C data types. It won't handle arbitrary length integers (Python 2 long integers), as these are not native to C. But you can convert them using a simple for loop. I expect that this will be significantly slower than the Python 3 way, since Python for loops are slower than looping at C speed, like int.from_bytes (probably) does.

from binascii import hexlify

def int_from_bytes_LE(s):
    total = 0
    for c in reversed(s):
        total = (total << 8) + ord(c)
    return total

# Test

data = (
    (b'\x01\x02\x03\x04', 0x04030201),
    (b'\x01\x02\x03\x04\x05\x06\x07\x08', 0x0807060504030201),
    (b'\x01\x23\x45\x67\x89\xab\xcd\xef\x01\x23\x45\x67\x89\xab\xcd\xef', 
        0xefcdab8967452301efcdab8967452301),
)

for s, u in data:
    print hexlify(s), u, int_from_bytes_LE(s)
    #print(hexlify(s), u, int.from_bytes(s, 'little'))

output

01020304 67305985 67305985
0102030405060708 578437695752307201 578437695752307201
0123456789abcdef0123456789abcdef 318753391026855559389420636404904698625 318753391026855559389420636404904698625

(I put that Python 3 print call in there so you can easily verify that my function gives the same result as int.from_bytes).

If you would rather walk the bytes in forward order, without calling reversed, you can do it this way:

def int_from_bytes_LE(s):
    m = 1
    total = 0
    for c in s:
        total += m * ord(c)
        m <<= 8
    return total

Note that this is not a memory saving. reversed(s) returns an iterator rather than a copy of the string, whereas m here grows until it is about as large as the result itself.
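
A different technique that avoids the Python-level loop is to go through the hex representation, which should work in both Python 2 and Python 3. This is a minimal sketch, not the method used above; the function name int_from_bytes_hex is made up for illustration.

```python
from binascii import hexlify

def int_from_bytes_hex(s, byteorder="little"):
    """Convert a byte string to an unsigned integer via its hex representation.

    int_from_bytes_hex is a hypothetical name used only for this example.
    """
    if not s:
        return 0  # int() cannot parse an empty string
    if byteorder == "little":
        s = s[::-1]  # reversed copy, so the most significant byte comes first
    return int(hexlify(s), 16)

print(int_from_bytes_hex(b"\x01\x02\x03\x04"))
print(int_from_bytes_hex(b"\x01\x02\x03\x04", byteorder="big"))
```

The trade-off is memory: the slice makes a reversed copy of the input and the hex string is twice its size, so this suits moderate values of N better than file-sized ones.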


0

Using Python 3 (or 2), you can achieve this with the struct module, as long as N matches one of its fixed sizes.

with open('blob.dat', 'rb') as f:
    data = f.read(n)

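Now unpack using the appropriate format string. For example, for a 4-byte big-endian signed int (use "<i" for little-endian):

import struct

# data must be exactly 4 bytes; unpack always returns a tuple
num = struct.unpack(">i", data)[0]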
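
Since the question is about little-endian data, here is a small sketch of the same call with the "<" prefix. It also shows that the buffer length has to match the format size, which is why struct only fits fixed widths and not an arbitrary N. The sample bytes are made up for illustration.

```python
import struct

data = b"\x01\x02\x03\x04"  # sample bytes, standing in for f.read(4)

(le_unsigned,) = struct.unpack("<I", data)  # little-endian unsigned 32-bit
(be_unsigned,) = struct.unpack(">I", data)  # big-endian unsigned 32-bit
print(le_unsigned, be_unsigned)

# The buffer length must equal the size implied by the format string.
size = struct.calcsize("<I")
print(size)

try:
    struct.unpack("<I", data[:3])  # one byte short
except struct.error as exc:
    print("unpack failed:", exc)
```

The standard integer codes cover 1, 2, 4 and 8 bytes (b/B, h/H, i/I, q/Q); for any other length, one of the arbitrary-length approaches is needed.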

2 Comments

struct is available in Python2 as well, isn't it?
@VPfB Yes. See the docs.
