I am working on a script that breaks another Python script down into blocks and uses pycrypto to encrypt them (all of which I have done successfully so far). Now I am storing the encrypted blocks in a file so that the decrypter can read it and execute each block. The final result of the encryption is a list of binary outputs (something like blocks=[b'\xa1\r\xa594\x92z\xf8\x16\xaa',b'xfbI\xfdqx|\xcd\xdb\x1b\xb3',etc...]).

When I write the output to a file, the items all end up in one giant line, so reading the file returns all the bytes as one giant line instead of the individual items from the original list. I also tried converting the bytes to a string and adding a '\n' at the end of each one, but I still need the bytes, and I can't figure out how to convert the string back to the original bytes.

To summarize, I am looking to either write each binary item to a separate line in a file so I can easily read the data back and use it in the decryption, or convert the data to a string and, in the decryption, convert the string back to the original binary data.

Here is the code for writing to the file:

    new_file = open('C:/Python34/testfile.txt','wb')
    for byte_item in byte_list:
        # For the string variant, I replaced 'wb' with 'w' and
        # byte_item with ascii(byte_item) + '\n'
        new_file.write(byte_item)
    new_file.close()

and for reading the file:

    # Or 'r' instead of 'rb' if using string method
    byte_list = open('C:/Python34/testfile.txt','rb').readlines()
  • "list of bytes that I want to store to a txt file" You store text in text files, and you store arbitrary bytes in a binary file. readlines() is for reading lines of text. Commented Aug 5, 2015 at 19:05
  • So what should my code look like instead? Commented Aug 5, 2015 at 19:28

3 Answers

A file is a stream of bytes without any implied structure. If you want to load a list of binary blobs, you should store some additional metadata so the structure can be restored; for example, you could use the netstring format:

#!/usr/bin/env python3
blocks = [b'\xa1\r\xa594\x92z\xf8\x16\xaa', b'xfbI\xfdqx|\xcd\xdb\x1b\xb3']

# save blocks
with open('blocks.netstring', 'wb') as output_file:
    for blob in blocks:
        # [len]":"[string]","
        output_file.write(str(len(blob)).encode())
        output_file.write(b":")
        output_file.write(blob)
        output_file.write(b",")

Read them back:

#!/usr/bin/env python3
import re
from mmap import ACCESS_READ, mmap

blocks = []
match_size = re.compile(br'(\d+):').match
with open('blocks.netstring', 'rb') as file, \
     mmap(file.fileno(), 0, access=ACCESS_READ) as mm:
    position = 0
    for m in iter(lambda: match_size(mm, position), None):
        i, size = m.end(), int(m.group(1))
        blocks.append(mm[i:i + size])
        position = i + size + 1 # shift to the next netstring
print(blocks)
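
The reader above trusts the file to be well formed. As a minimal sketch of a stricter variant, assuming only a binary file-like object, the generator below parses the same layout byte by byte and raises if a length prefix or the trailing comma is missing. The name read_netstrings is hypothetical, not a library function.

```python
import io


def read_netstrings(stream):
    """Yield each payload from a binary stream of <len>:<bytes>, records.

    read_netstrings is a hypothetical helper name, not a library function.
    """
    while True:
        digits = b""
        # Collect the decimal length prefix up to the ':' separator.
        while True:
            ch = stream.read(1)
            if not ch:
                if digits:
                    raise ValueError("stream ended inside a length prefix")
                return  # clean end of stream
            if ch == b":":
                break
            if not ch.isdigit():
                raise ValueError("unexpected byte in length prefix: %r" % ch)
            digits += ch
        if not digits:
            raise ValueError("missing length prefix")
        size = int(digits)
        payload = stream.read(size)
        # Each record must contain exactly `size` bytes followed by ','.
        if len(payload) != size or stream.read(1) != b",":
            raise ValueError("truncated or malformed netstring")
        yield payload


blocks = [b'\xa1\r\xa594\x92z\xf8\x16\xaa', b'xfbI\xfdqx|\xcd\xdb\x1b\xb3']
encoded = b"".join(str(len(b)).encode() + b":" + b + b"," for b in blocks)
decoded = list(read_netstrings(io.BytesIO(encoded)))
print(decoded == blocks)
```

The same function should work on a file opened with open('blocks.netstring', 'rb'), since it only calls read().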

As an alternative, you could consider BSON format for your data or ascii armor format.
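
For the text-based route, one possible sketch in the spirit of ASCII armor is to Base64-encode each blob and write one encoded blob per line. Base64 output contains no newline bytes, so the line breaks can safely act as separators. The helper names save_blocks and load_blocks and the file name blocks.b64 are assumptions made for this illustration.

```python
import base64
import os
import tempfile


def save_blocks(path, blocks):
    """Write one Base64-encoded blob per line (hypothetical helper)."""
    with open(path, "wb") as f:
        for blob in blocks:
            f.write(base64.b64encode(blob) + b"\n")


def load_blocks(path):
    """Read the lines back and decode each one to the original bytes."""
    with open(path, "rb") as f:
        return [base64.b64decode(line.strip()) for line in f]


blocks = [b'\xa1\r\xa594\x92z\xf8\x16\xaa', b'xfbI\xfdqx|\xcd\xdb\x1b\xb3']
# A temporary directory keeps the demo from touching real files.
path = os.path.join(tempfile.mkdtemp(), "blocks.b64")
save_blocks(path, blocks)
restored = load_blocks(path)
print(restored == blocks)
```

The trade-off is size: Base64 makes the stored data roughly a third larger than the raw bytes, whereas the netstring approach adds only a few bytes per blob.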


I think what you're looking for is byte_list=open('C:/Python34/testfile.txt','rb').read()

If you know how many bytes each item is, you can use read(number_of_bytes) to process one item at a time.

read() will read the entire file, but then it is up to you to decode that entire list of bytes into their respective items.
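
If every item really does have the same length, the read(number_of_bytes) idea could look like the sketch below. BLOCK_SIZE = 16 and the name read_fixed_blocks are assumptions for illustration only; the actual size depends on how the blocks were produced.

```python
import io
from functools import partial

BLOCK_SIZE = 16  # hypothetical size; use the real length of each item


def read_fixed_blocks(stream, size=BLOCK_SIZE):
    """Return a list of chunks of `size` bytes read from a binary stream."""
    # iter(callable, sentinel) keeps calling stream.read(size) until it
    # returns b"", which signals the end of the stream.
    return list(iter(partial(stream.read, size), b""))


data = bytes(range(48))  # stands in for the contents of the file
chunks = read_fixed_blocks(io.BytesIO(data))
print(len(chunks), [len(c) for c in chunks])
```

If the total length is not a multiple of the block size, the last chunk will be shorter than the others, so this only fits data whose items are all the same length.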


In general, since you're using Python 3, you will be working with bytes objects (which are immutable) and/or bytearray objects (which are mutable).

Example:

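b1 = bytearray('hello', 'utf-8')
print(b1)

b1 += bytearray(' goodbye', 'utf-8')
print(b1)

# Write the raw bytes; the with block closes the file afterwards
with open('temp.bin', 'wb') as f:
    f.write(b1)

#------

# Read everything back as a single bytes object
with open('temp.bin', 'rb') as f:
    b2 = f.read()
print(b2)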

Output:

bytearray(b'hello')
bytearray(b'hello goodbye')
b'hello goodbye'

3 Comments

How exactly does this solve my problem? This will just give me a giant byte; what I am looking for is, on the receiving end, a list of the bytes that went in (in your case, when I read/readline I would be able to easily derive [b'hello',b'goodbye']).
"a giant byte" - Lol "byte" almost always means "octet" these days, exactly 8 bits. If you're actually dealing with binary data, it would be incredibly inefficient in Python to have a list of individual byte values, which is why I suggested using bytes and bytearray objects. You haven't explained what kind of data you're actually trying to store and recover, so it's difficult to give better advice - especially because you refer to both "bytes" (implying binary data) and strings of text.
I'll edit my question, and hopefully this will help you understand what I am looking for. What I am trying to accomplish is: I have a Python script, I am breaking the script down into blocks and using pycrypto to encrypt the blocks (all of which I have done successfully so far), and now I am storing the encrypted blocks in a file so that the decrypter can read it and execute each block. The final result of the encryption is a list of binary outputs.
