I'm using a command to convert Chinese characters into pinyin, and I want to store the result in another variable, so the string handling in my bash script needs to be Unicode safe. I can run the following command normally:
chinese="你好"
to-pinyin.py $chinese
It prints the output as expected. However, since I want the output in a variable, I tried the following:
chinese="你好"
pinyin=$(to-pinyin.py $chinese)
And Python fails with:
Traceback (most recent call last):
File "/~/to-pinyin.py", line 31, in <module>
print pinyin.get(hanzi, delimiter=" ").capitalize()
UnicodeEncodeError: 'ascii' codec can't encode character u'\u01d0' in position 1: ordinal not in range(128)
The same thing happens with backticks. I think I could work around it by writing the output to a file, doing the conversion there, and then loading the strings into a variable. How else can I fix this so that I can avoid the workaround?
EDIT:
Per request, here is the output of locale:
$ locale
LANG=en_US.UTF-8
LANGUAGE=en_US
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
SOLUTION USED
Thanks to muru's response and some help from this other answer, I added .encode('utf-8') to the end of the printed strings in my Python script.
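For anyone hitting the same error, here is a minimal sketch of the idea, not my actual script. Python 2 falls back to the ascii codec when stdout is not a terminal, which is the case inside $(...) or backticks, so encoding the text explicitly sidesteps that. The helper names to_output_bytes and emit are hypothetical, made up for this illustration, and the pinyin lookup itself is left out.

```python
# -*- coding: utf-8 -*-
import sys


def to_output_bytes(text, encoding="utf-8"):
    # Hypothetical helper: encode the text explicitly so the result does not
    # depend on whatever codec Python picked for stdout.
    return text.encode(encoding)


def emit(text):
    # Hypothetical helper: write the encoded bytes followed by a newline.
    # Python 3 exposes the raw byte stream as sys.stdout.buffer; Python 2's
    # sys.stdout accepts byte strings directly, hence the fallback.
    stream = getattr(sys.stdout, "buffer", sys.stdout)
    stream.write(to_output_bytes(text + u"\n"))
    stream.flush()
```

In the script, something like emit(u"Nǐ hǎo") would take the place of the bare print. Setting PYTHONIOENCODING=utf-8 in the environment before running the script is another way to get a similar effect without touching the code.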
I wish I could switch to Python 3, but there is no default pinyin package there, and I couldn't install any good pinyin package that would get my job done quickly in Python 3. I remember trying for a while, but Python 3 kept refusing to import the package I had installed, so I just installed one in Python 2 and it worked straight out of the box.