15

Say that you have a string of bytes generated via os.urandom(24),

b'\x1b\xba\x94(\xae\xd0\xb2\xa6\xf2f\xf6\x1fI\xed\xbao$\xc6D\x08\xba\x81\x96v'

and you'd like to store that in an environment variable,

export FOO='\x1b\xba\x94(\xae\xd0\xb2\xa6\xf2f\xf6\x1fI\xed\xbao$\xc6D\x08\xba\x81\x96v'

and retrieve the value from within a Python program using os.environ.

foo = os.environ['FOO']

The problem is that, here, foo has the string literal value '\\x1b\\xba\\x94... instead of the byte sequence b'\x1b\xba\x94....

What is the proper export value to use, or means of using os.environ to treat FOO as a string of bytes?

3
  • Could be because of the single quotation marks. Commented Jun 11, 2017 at 2:28
  • I'm confused; if you print (repr) foo in Python where it came from something like os.urandom and see b'\x1b\xba...' then it is (in Python) raw bytes. If you read it from the envvar and see '\\x1b\\xba' then it's a (Unicode) string that's still escaped. As per this question, it seems like bash won't interpret your export FOO line as real binary, but a string with a bunch of \x's in it. Commented Sep 27, 2017 at 23:26
  • An alternative option is to save the bytes in a binary file and store the file's name in the environment variable. Commented Apr 3, 2019 at 19:22
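The file-based alternative from the last comment could look roughly like the sketch below. The variable name FOO_FILE and both helper functions are hypothetical, chosen only for illustration; nothing in the question prescribes them.

```python
import os
import tempfile


def write_secret_file(raw: bytes) -> str:
    """Write raw bytes to a new temporary file and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as fh:
        fh.write(raw)
    return path


def read_secret_from_env(name: str = "FOO_FILE") -> bytes:
    """Read the bytes stored in the file that the environment variable names."""
    with open(os.environ[name], "rb") as fh:
        return fh.read()
```

Because the bytes never pass through the shell or the environment block, this avoids quoting and escaping problems entirely, at the cost of managing a file.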

3 Answers

12

The easiest option is to set the variable to the binary data directly in your shell. This uses ANSI-C quoting (the $'...' form), so the shell expands the \x escapes into raw bytes and no conversion is needed on the Python side.

export FOO=$'\x1b\xba\x94(\xae\xd0\xb2\xa6\xf2f\xf6\x1fI\xed\xbao$\xc6D\x08\xba\x81\x96v'

NB: this type of string is not part of the POSIX specification at this time, but is in the process of being added. Support is present in almost all major shells, including Bash, ksh, and zsh. Ensure your shell supports it before relying on its use.
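On the Python 3 side, a minimal sketch for reading such a value as raw bytes might look like this. It assumes a POSIX system (os.environb is not available on Windows), and the helper name get_env_bytes is made up for the example.

```python
import os


def get_env_bytes(name: str) -> bytes:
    """Return the raw bytes of an environment variable (POSIX only)."""
    # os.environb is the bytes-oriented view of the environment.
    return os.environb[os.fsencode(name)]


def get_env_bytes_via_str(name: str) -> bytes:
    """Same result through os.environ, which decodes with surrogateescape."""
    # os.fsencode reverses the decoding that os.environ applied.
    return os.fsencode(os.environ[name])
```

One caveat: an environment value cannot contain a NUL byte, so output from os.urandom() that happens to include \x00 cannot be stored this way.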


1 Comment

This is a great approach, since it makes reading the data in Python as simple as os.environ['FOO'] (Py2) or os.environb[b'FOO'] (Py3), so you get the data in Python as raw bytes without needing to encode or decode at all. I'd completely forgotten about this feature of Bash, so thanks for the reminder!
8

You can 'unescape' your bytes in Python with:

import os
import sys

if sys.version_info[0] < 3:  # sadly, it's done differently in Python 2.x vs 3.x
    foo = os.environ["FOO"].decode('string_escape')  # since already in bytes...
else:
    foo = bytes(os.environ["FOO"], "utf-8").decode('unicode_escape')

1 Comment

Your Py3 solution produces a str, not a bytes object, and unnecessarily converts the string form to bytes. Replace that second line with: foo = os.environb[b'FOO'].decode('unicode-escape').encode('latin-1') to make it read from os.environb (the bytes-oriented view of the environment), decode the escapes, then convert back to raw bytes (latin-1 is a 1-1 mapping that maps the first 256 Unicode ordinals to their ordinal value as bytes).
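Following that comment, a Python 3 version that returns bytes could be sketched as below. The function name unescape_env_bytes is hypothetical, and the sketch assumes a POSIX system where os.environb exists and a variable that holds the escaped text (exported with plain single quotes).

```python
import os


def unescape_env_bytes(name: str) -> bytes:
    """Turn escaped text such as \\x1b\\xba in an env var into raw bytes."""
    escaped = os.environb[os.fsencode(name)]   # the literal backslash-x text
    text = escaped.decode("unicode_escape")    # str with code points 0..255
    return text.encode("latin-1")              # map each code point to one byte


# latin-1 maps the first 256 code points one-to-one onto byte values,
# so a decode/encode round trip leaves every possible byte unchanged.
all_bytes = bytes(range(256))
round_trip_ok = all_bytes.decode("latin-1").encode("latin-1") == all_bytes
```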
3

I tried zwer's answer as follows.

First, from bash (this is the same binary literal given by ybakos):

export FOO='\x1b\xba\x94(\xae\xd0\xb2\xa6\xf2f\xf6\x1fI\xed\xbao$\xc6D\x08\xba\x81\x96v'

Then I launched the Python shell (I have Python 3.5.2):

>>> import os
>>> # ybakos's original binary literal
>>> foo =  b'\x1b\xba\x94(\xae\xd0\xb2\xa6\xf2f\xf6\x1fI\xed\xbao$\xc6D\x08\xba\x81\x96v'
>>> # zwer's python 3.x solution
>>> FOO = bytes(os.environ["FOO"], "utf-8").decode('unicode_escape')
>>> foo == FOO
False
>>> ^D

The comparison foo == FOO on the last line should return True, so the solution does not appear to work correctly.
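One likely explanation for the False result is a type mismatch rather than a bad decode: the expression produces a str, and in Python 3 a str never compares equal to a bytes object. The sketch below reproduces this without touching the environment; the variable names are illustrative only.

```python
# The escaped text as it arrives from the single-quoted export.
escaped = r'\x1b\xba\x94(\xae\xd0\xb2\xa6\xf2f\xf6\x1fI\xed\xbao$\xc6D\x08\xba\x81\x96v'
expected = b'\x1b\xba\x94(\xae\xd0\xb2\xa6\xf2f\xf6\x1fI\xed\xbao$\xc6D\x08\xba\x81\x96v'

decoded = bytes(escaped, "utf-8").decode("unicode_escape")

is_str = isinstance(decoded, str)                       # decode() yields str
equal_as_str = decoded == expected                      # str vs bytes comparison
equal_as_bytes = decoded.encode("latin-1") == expected  # bytes vs bytes comparison
```

Here is_str and equal_as_bytes are True while equal_as_str is False, which suggests the escapes were decoded correctly and only the final conversion back to bytes was missing.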

I noticed that there is an os.environb dictionary, but I couldn't figure out how to set an environment variable to a binary literal, so I tried the following alternative, which uses Base64 encoding to get an ASCII version of the binary literal.

First, launch the Python shell:

>>> import os
>>> import base64
>>> foo = os.urandom(24)
>>> foo
b'{\xd9q\x90\x8b\xba\xecv\xb3\xcb\x1e<\xd7\xba\xf1\xb4\x99\xf056\x90U\x16\xae'
>>> foo_base64 = base64.b64encode(foo)
>>> foo_base64
b'e9lxkIu67Hazyx4817rxtJnwNTaQVRau'
>>> ^D

Then, in the bash shell:

export FOO_BASE64='e9lxkIu67Hazyx4817rxtJnwNTaQVRau'

Then, back in the Python shell:

>>> import os
>>> import base64
>>> # the original binary value from the first python shell session
>>> foo = b'{\xd9q\x90\x8b\xba\xecv\xb3\xcb\x1e<\xd7\xba\xf1\xb4\x99\xf056\x90U\x16\xae'
>>> dec_foo = base64.b64decode(bytes(os.environ.get('FOO_BASE64'), "utf-8"))
>>> # the values match!
>>> foo == dec_foo
True
>>> ^D

The last line shows that the two results are the same.

Here we first get a binary value from os.urandom() and Base64-encode it, then use the encoded value to set the environment variable. Note: base64.b64encode() returns a bytes object, but it contains only printable ASCII characters.

Then, in our program, we read the Base64-encoded string from the environment variable, convert the string to bytes, and finally Base64-decode it back to its original value.
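The two halves of this approach can be wrapped in a pair of small helpers, sketched below. The function names are hypothetical; FOO_BASE64 is the variable name used above.

```python
import base64
import os


def encode_for_env(raw: bytes) -> str:
    """Return an ASCII-only Base64 string that is safe to export."""
    return base64.b64encode(raw).decode("ascii")


def decode_from_env(name: str = "FOO_BASE64") -> bytes:
    """Read a Base64 string from the environment and return the raw bytes."""
    # b64decode accepts an ASCII str directly; validate=True rejects
    # characters outside the Base64 alphabet instead of skipping them.
    return base64.b64decode(os.environ[name], validate=True)
```

A typical use would be to print encode_for_env(os.urandom(24)) once, export the result in the shell, and call decode_from_env() in the program. Unlike raw bytes, this form also survives values that contain NUL bytes.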
