
I am writing a Huffman coding program. So far I have written only the compression part: it takes the text I want to compress, creates a code for each character, and replaces each character with its code. The result is my compressed text as a string of 0s and 1s, which I convert into a byte array using the following code:

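def make_byte_array(self, padded_text):
    # padded_text is expected to be a string of '0'/'1' characters
    # whose length is a multiple of 8.
    byte_array = bytearray()
    for i in range(0, len(padded_text), 8):
        # Interpret each 8-character slice as a base-2 integer (0..255).
        byte_array.append(int(padded_text[i:i + 8], 2))

    return byte_array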

I then save byte_array to a .bin file as bytes(byte_array). Now I want to open this binary file, read the bytes back, and turn them into the string form of my compressed text so that I can decompress it. The problem is that whenever I open and read the file, I get something like this:

b'\xad"\xfdK\xa8w\xc1\xec\xcb\xe5)\x1f\x1f\x92'

How would I go about converting this back into the string format of my compressed text?

1 Answer


If s is that byte string:

for x in s:
    print(f'{x:08b}')

Instead of print, you can do what you like with the strings of 0's and 1's.
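For example, the per-byte strings can be joined back into one long bit string. This is a minimal sketch: `bytes_to_bit_string` is a hypothetical helper name, not something from the question, and any padding bits added during compression would still have to be stripped separately.

```python
def bytes_to_bit_string(data: bytes) -> str:
    """Return the bits of data as a string of '0' and '1', 8 characters per byte."""
    # Iterating over a bytes object yields integers in the range 0..255.
    return ''.join(f'{x:08b}' for x in data)


s = b'\xad"\xfdK'  # first four bytes of the example in the question
bits = bytes_to_bit_string(s)
print(bits)  # 10101101001000101111110101001011
```

Feeding the result back through `int(chunk, 2)` on 8-character slices reproduces the original bytes, so this is the inverse of `make_byte_array`.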

It is unnecessarily inefficient to go through strings of 0 and 1 characters for encoding and decoding. You should instead assemble and disassemble the bytes directly using the bit operators (<<, >>, |, &).
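One way this might look is sketched below. The names `pack_bits` and `unpack_bits` are hypothetical, and the zero padding of the final byte is an assumption, since the question does not show its padding scheme. Bits are handled most significant bit first, matching what `int(chunk, 2)` produces.

```python
def pack_bits(bits):
    """Pack an iterable of 0/1 integers into bytes, most significant bit first."""
    out = bytearray()
    current = 0  # bits accumulated for the byte in progress
    count = 0    # how many bits are currently held in `current`
    for bit in bits:
        current = (current << 1) | (bit & 1)  # shift left, append the new bit
        count += 1
        if count == 8:
            out.append(current)
            current = 0
            count = 0
    if count:
        # Assumption: left-align the last partial byte and pad with zeros.
        out.append(current << (8 - count))
    return bytes(out)


def unpack_bits(data):
    """Yield the bits of data as 0/1 integers, most significant bit first."""
    for byte in data:
        for k in range(7, -1, -1):
            yield (byte >> k) & 1  # shift bit k down to position 0, then mask


print(pack_bits([1, 0, 1, 0, 1, 1, 0, 1]))  # b'\xad'
print(list(unpack_bits(b'\xad')))           # [1, 0, 1, 0, 1, 1, 0, 1]
```

A decoder can consume the bits from `unpack_bits` one at a time while walking the Huffman tree, so the intermediate string of characters is never built.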


4 Comments

Thank you! Silly question but how would the bit operators help me disassemble the bytes? Aren't they just AND and OR gates, and left and right shifts?
Yes, and that's all you need. Shift down by k bits, & with 1, and the result is the bit from position k.
Thanks again! One more thing - what does f'{x:08b}' mean exactly? What does the f'' do and does the 08b just split it into bits?
Isn't that what Google is for? f means a formatted string (an f-string). The part in braces says to format the integer x in binary (the b). The 8 is the minimum width: at least eight characters are produced, and more if the value needs them. The 0 says to pad with leading zeros instead of spaces.
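The points in these comments can be checked with a short sketch. The variable names are illustrative only, and the last lines show that the shift-and-mask approach yields the same digits as the format specifier.

```python
x = 5

plain = f'{x:b}'        # no width: only the significant digits
spaced = f'{x:8b}'      # minimum width 8, padded with spaces
zeroed = f'{x:08b}'     # minimum width 8, padded with zeros
wide = f'{300:08b}'     # needs 9 digits: the width is a minimum, nothing is cut off

print(repr(plain))   # '101'
print(repr(spaced))  # '     101'
print(repr(zeroed))  # '00000101'
print(repr(wide))    # '100101100'

# Shift down by k bits and & with 1 to get the bit at position k.
by_shifting = ''.join(str((x >> k) & 1) for k in range(7, -1, -1))
print(by_shifting == zeroed)  # True
```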
