
I am using Python and trying to determine the length, in bytes, of some binary data.

If I assign a variable the binary data...

x = "aabb".decode("hex")

is that the same as

x = b'aabb'

And if so, how do I get the number of bytes it contains? (It should be 2 bytes.)

When I try:

len(x)

I get 4 instead of 2 though...

I am worried that x is turned into a string or something else I don't understand because the data types are so fluid in Python...

  • When I do x = "aabb".decode("hex"), then len(x) returns 2. Is that not what it returns on your machine? Commented Nov 21, 2016 at 17:46
  • "aabb".decode("hex") == b'aabb' returns False. Your assumption that these two forms are equal is wrong. Commented Nov 21, 2016 at 17:49
  • Kevin, you are correct. Only when I do x = b'aabb' do I get a length of 4, so I guess the question is: what does b'aabb' actually do? Commented Nov 21, 2016 at 17:57
  • In Python 2, b'aabb' is identical to 'aabb', but they're different in Python 3. Also, "aabb".decode("hex") in Python 3 raises AttributeError: 'str' object has no attribute 'decode'. And b"aabb".decode("hex") in Python 3 raises LookupError: 'hex' is not a text encoding; use codecs.decode() to handle arbitrary codecs. Commented Nov 21, 2016 at 17:58
  • To convert 'aabb' to the bytestring b'\xaa\xbb' in a way that works in both versions, you can do: import binascii; binascii.unhexlify("aabb") Commented Nov 21, 2016 at 18:04

1 Answer

The length of binary data is simply len(x), and its type is str in Python 2.x (or bytes in Python 3.x). However, the object b'aabb' does not contain the two bytes 0xaa and 0xbb. It contains 4 bytes, the ASCII codes of the characters 'a' and 'b':

>>> bytearray([0x61, 0x61, 0x62, 0x62])
bytearray(b'aabb')
>>> bytearray([0x61, 0x61, 0x62, 0x62]) == b'aabb'
True
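
To see this directly, the individual byte values can be listed. This is a minimal sketch (the names data and byte_values are illustrative, not from the question), and it should behave the same on Python 2 and Python 3:

```python
# b'aabb' holds 4 bytes: the ASCII codes of 'a', 'a', 'b', 'b'.
data = b'aabb'

# Iterating a bytearray yields integers on both Python 2 and Python 3.
byte_values = list(bytearray(data))

print(len(data))                       # 4
print(byte_values)                     # [97, 97, 98, 98]
print([hex(v) for v in byte_values])   # ['0x61', '0x61', '0x62', '0x62']
```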

This is probably the equivalence you were actually looking for (Python 2 syntax; in Python 3, str has no decode method):

>>> 'aabb'.decode('hex') == b'\xaa\xbb' 
True

The following objects are all equal, and each has length 2:

>>> s1 = 'aabb'.decode('hex')
>>> s2 = b'\xaa\xbb'
>>> s3 = bytearray([0xaa, 0xbb])
>>> s4 = bytearray([170, 187])
>>> s1 == s2 == s3 == s4
True
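
The 'aabb'.decode('hex') spelling only works on Python 2. As a hedged sketch of the same comparison on newer interpreters: binascii.unhexlify (suggested in the comments on the question) works on both versions, while the bytes.fromhex and codecs.decode lines assume Python 3. The names p1 to p4 are illustrative only.

```python
import binascii
import codecs

# Works on Python 2 and Python 3: hex text -> raw bytes.
p1 = binascii.unhexlify("aabb")

# Python 3 spellings of the same conversion (assumption: Python 3).
p2 = bytes.fromhex("aabb")
p3 = codecs.decode(b"aabb", "hex")

# The same two bytes built from integer values.
p4 = bytearray([0xaa, 0xbb])

print(p1 == p2 == p3 == p4 == b'\xaa\xbb')  # True
print(len(p1))                               # 2
```

In each case len() counts the decoded bytes, not the hex characters, so the result is 2 rather than 4.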

2 Comments

hmm.. would be good to answer the title though rather than the body text
@ElliotWoods I've edited to put it first and foremost.
