I'm porting a long, ugly function from JS to Python that calculates a hash string from some input parameters. After porting and adapting the code, I did some testing and (surprise, surprise) I don't get the same result.

I did some debugging and found the line where things start to go wrong: it turns out to be an XOR operation. Long story short, I have isolated a simple example that shows how the same values produce a different result.

This is JS code:

hex_str = "0xA867DF55"
crc = -1349196347
new_crc = (crc >> 8) ^ hex_str
//new_crc == 1472744368

This is the same code in Python:

hex_str = "0xA867DF55"
crc = -1349196347
new_crc = (crc >> 8) ^ int(hex_str, 16)
# new_crc == -2822222928

The only difference is that hex_str gets explicitly converted to an integer in Python.

In the actual code, this calculation runs within a for loop, and hex_str and crc get updated on each iteration. The first few iterations work fine in Python, but once hex_str and crc reach the values shown above, the results start to diverge.

1 Answer

The difference is in how signed numbers are treated. Python treats integers as having arbitrary bit length in all contexts. For bit manipulations, a negative number behaves as if it had "enough" leading one bits for any purpose, so XORing a negative number with a positive one always gives a negative number. In JavaScript, on the other hand, the operands of bitwise operations are converted to signed 32-bit integers, so the result may be different.
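The effect can be seen directly with the values from the question. This is only an illustrative sketch; the variable names are made up for the example:

```python
# (crc >> 8) from the question; Python keeps the sign when shifting right.
shifted = -1349196347 >> 8
mask_value = int("0xA867DF55", 16)   # a positive number in Python

result = shifted ^ mask_value

# The negative operand behaves as if it had infinitely many leading one bits,
# so the XOR with a positive number stays negative.
print(result < 0)

# The magnitude no longer fits in a signed 32-bit integer.
print(result.bit_length())
```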

CRC32 is computed using 32-bit integers. To simulate that behaviour in Python, you can limit all operations to 32 bits by taking the lower 32 bits of any result:

>>> -2822222928 & (2 ** 32 - 1)
1472744368

or, applied to your code:

hex_str = "0xA867DF55"
crc = -1349196347
new_crc = ((crc >> 8) ^ int(hex_str, 16)) & (2 ** 32 - 1)
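Note that the mask alone yields an unsigned value in the range 0 to 2**32 - 1, whereas JavaScript's bitwise operators return a signed 32-bit value. If the sign has to match as well, the masked result can be reinterpreted as signed. A minimal sketch, where to_int32 is a hypothetical helper name and not part of the original code:

```python
def to_int32(value):
    """Reduce an arbitrary Python int to a signed 32-bit value."""
    value &= 0xFFFFFFFF              # keep only the lower 32 bits
    if value >= 0x80000000:          # bit 31 set: negative in two's complement
        value -= 0x100000000
    return value


hex_str = "0xA867DF55"
crc = -1349196347
new_crc = to_int32((crc >> 8) ^ int(hex_str, 16))
print(new_crc)  # 1472744368, the same value the JS snippet reports
```

This assumes crc itself stays within the signed 32-bit range between iterations, so that Python's >> behaves like the sign-propagating >> in JavaScript.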

2 Comments

Well, I tried your solution, and even though at first it worked for the values in my original question, with other combinations I still got odd results. I finally ended up using ctypes.c_int, which gave me more solid results. I don't know why your solution of applying that 32-bit mask doesn't always work, but anyway, thanks for giving me some hints!
@KilianPerdomo You might still get the wrong sign for the result, so you would have to adjust the sign if the result is bigger than 2**31 - 1. Using ctypes.c_int is probably the easier solution, though I'm not sure it's guaranteed to have 32 bits. You can go with ctypes.c_int32 to be safe.
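A minimal sketch of the ctypes.c_int32 approach mentioned above. The helper name wrap_int32 is made up for illustration. It relies on ctypes integer types performing no overflow checking, so out-of-range values are truncated to the lower 32 bits and read back as signed:

```python
import ctypes


def wrap_int32(value):
    """Truncate a Python int to 32 bits and interpret it as signed."""
    return ctypes.c_int32(value).value


hex_str = "0xA867DF55"
crc = -1349196347
new_crc = wrap_int32((crc >> 8) ^ int(hex_str, 16))
print(new_crc)  # 1472744368

# A value with bit 31 set comes back negative, as in JavaScript.
print(wrap_int32(int(hex_str, 16)))  # -1469587627
```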
