Python3 ASCII Hexadecimal to Binary String Conversion

Question

I'm using Python 3.2.3 on Windows, and am trying to convert binary data within a C-style ASCII file into its binary equivalent for later parsing using the struct module. For example, my input file contains "0x000A 0x000B 0x000C 0x000D", and I'd like to convert it into "\x00\x0a\x00\x0b\x00\x0c\x00\x0d".

The problem I'm running into is that the string datatypes have changed in Python 3, and the built-in functions that convert from hexadecimal to binary, such as binascii.unhexlify(), no longer accept regular Unicode strings, only byte strings. Converting from Unicode strings to byte strings and back is confusing me, so I'm wondering if there's an easier way to achieve this. Below is what I have so far:

import binascii

with open(path, "r") as f:
    l = []
    data = f.read()
    values = data.split(" ")

    for v in values:
        if v.startswith("0x"):
            # unhexlify() needs bytes, so encode first and decode the result
            l.append(binascii.unhexlify(bytes(v[2:], "utf-8")).decode("utf-8"))

    string = ''.join(l)

No, I haven't tried opening the file as binary. My thinking was that the input file uses quasi-C syntax, so I would need to filter out comments and the separators between hexadecimal numbers while also performing the hexadecimal-to-binary conversion, which could get tricky. That is why I opened it in text mode and split it on the space delimiter: I can then loop through the tokens and skip anything that doesn't start with "0x".
– ddcc, Commented Oct 7, 2012 at 4:07

Ignacio Vazquez-Abrams · Accepted Answer · 2012-10-07 04:13:04Z


>>> ''.join(chr(int(x, 16)) for x in "0x000A 0x000B 0x000C 0x000D".split()).encode('utf-16be')
b'\x00\n\x00\x0b\x00\x0c\x00\r'
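
The one-liner works because each 16-bit value becomes a single code point, and UTF-16BE writes that code point as two big-endian bytes. As a minimal sketch of the same conversion without going through str, assuming every token is a 16-bit value, int.to_bytes() can be used directly. The helper name hex_words_to_bytes is hypothetical and not part of the answer:

```python
def hex_words_to_bytes(text):
    """Convert whitespace-separated tokens such as '0x000A' to big-endian 16-bit words."""
    return b"".join(
        int(token, 16).to_bytes(2, "big")  # int() with base 16 accepts the 0x prefix
        for token in text.split()
        if token.lower().startswith("0x")  # skip anything that is not a hex token
    )

result = hex_words_to_bytes("0x000A 0x000B 0x000C 0x000D")
print(result)  # b'\x00\n\x00\x0b\x00\x0c\x00\r'
```

One difference worth noting: values in the range 0xD800 to 0xDFFF are surrogates and cannot be encoded as UTF-16, so the chr()-based version raises UnicodeEncodeError for them, while this sketch treats them like any other 16-bit number.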


kampu · Accepted Answer · 2012-10-07 04:08:50Z


As agf says, opening the file with mode 'r' gives you string (str) data. Since all you are doing here is handling binary data, you probably want to open it with 'rb' mode and make your result of type bytes, not str.

Something like:

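import binascii

with open(path, "rb") as f:
    l = []
    data = f.read()
    # split() with no argument splits on any whitespace, including newlines
    values = data.split()

    for v in values:
        if v.startswith(b"0x"):
            # unhexlify() accepts bytes directly, so no encode/decode is needed
            l.append(binascii.unhexlify(v[2:]))

    result = b''.join(l)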
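
Because the question mentions comments, separators, and later parsing with struct, here is a sketch that extracts the hex tokens with a regular expression instead of splitting on spaces. It rests on assumptions not confirmed by the question: that comments use /* ... */ or // syntax, and that every token has an even number of hex digits (tokens with an odd count are skipped, since unhexlify() would reject them). The names COMMENT, HEX_TOKEN and extract_hex_bytes are hypothetical:

```python
import binascii
import re
import struct

# Assumed comment syntax for the "quasi-C" input: /* ... */ and // ...
COMMENT = re.compile(rb"/\*.*?\*/|//[^\n]*", re.DOTALL)
# "0x" followed by one or more pairs of hex digits, ending at a word boundary
HEX_TOKEN = re.compile(rb"0[xX]((?:[0-9A-Fa-f]{2})+)\b")

def extract_hex_bytes(data):
    """Return the bytes encoded by every 0x... token in data (a bytes object)."""
    without_comments = COMMENT.sub(b" ", data)
    return b"".join(
        binascii.unhexlify(match.group(1))
        for match in HEX_TOKEN.finditer(without_comments)
    )

sample = b"/* header 0xFFFF */ 0x000A, 0x000B,\n0x000C 0x000D // end\n"
raw = extract_hex_bytes(sample)
print(raw)                        # b'\x00\n\x00\x0b\x00\x0c\x00\r'
print(struct.unpack(">4H", raw))  # (10, 11, 12, 13)
```

With a real file, the contents read via open(path, "rb") would be passed to extract_hex_bytes() in place of sample.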
