13

I have a bunch of byte strings (str, not unicode, in Python 2.7) containing Unicode data (UTF-8 encoded).

I am trying to join them (with "".join(utf8_strings) or u"".join(utf8_strings)), which raises

UnicodeDecodeError: 'ascii' codec can't decode byte 0xec in position 0: ordinal not in range(128)

Is there any way to use the .join() method for non-ASCII strings? Sure, I could concatenate them in a for loop, but that would be inefficient.


2 Answers

17

Joining byte strings using ''.join() works just fine; the error you see would only appear if you mixed unicode and str objects:

>>> utf8 = [u'\u0123'.encode('utf8'), u'\u0234'.encode('utf8')]
>>> ''.join(utf8)
'\xc4\xa3\xc8\xb4'
>>> u''.join(utf8)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc4 in position 0: ordinal not in range(128)
>>> ''.join(utf8 + [u'unicode object'])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc4 in position 0: ordinal not in range(128)

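The two exceptions above are raised when using the Unicode value u'' as the joiner and when adding a Unicode string to the list of strings to join, respectively. In both cases Python 2 implicitly decodes the byte strings as ASCII to produce a unicode result, and that fails on non-ASCII bytes.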


2 Comments

How would one un-mix unicode and str objects then?
@fiona Decode your byte strings to Unicode, then join. It's best to decode as early as possible and to encode only when you are done with the text and must pass it on to something that only accepts bytes.
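A minimal sketch of that decode-first approach, assuming every byte string is UTF-8 encoded. The helper name join_as_text is hypothetical and not part of any library. The bytes check is used because bytes is an alias for str in Python 2.7, so the same code also runs on Python 3:

```python
# -*- coding: utf-8 -*-

def join_as_text(parts, encoding="utf-8"):
    """Decode any byte strings in `parts`, then join everything as text.

    `join_as_text` is an illustrative helper; the UTF-8 default is an
    assumption about how the byte strings were encoded.
    """
    # In Python 2.7, `bytes` is the same type as `str`, so this catches
    # byte strings and leaves unicode objects untouched.
    decoded = [p.decode(encoding) if isinstance(p, bytes) else p
               for p in parts]
    return u"".join(decoded)


# A mixed list: two byte strings and one unicode object.
mixed = [u"\u0123".encode("utf-8"), u"\u0234", u"abc".encode("utf-8")]

text = join_as_text(mixed)      # text (unicode) all the way through
encoded = text.encode("utf-8")  # encode only at the output boundary
```

If the data might not be valid UTF-8, decode() raises UnicodeDecodeError at the decoding step, which is usually easier to track down than an implicit ASCII decode inside join().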
2

"".join(...) will work if each parameter is a str (whatever the encoding may be).

The issue you are seeing is probably not related to the join, but the data you supply to it. Post more code so we can see what's really wrong.
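One way to check the data before joining, as a rough sketch: list the positions of the byte strings in the input, so a mix of str and unicode becomes visible. The function name byte_string_positions is hypothetical:

```python
# -*- coding: utf-8 -*-

def byte_string_positions(parts):
    """Return the indexes of the items in `parts` that are byte strings.

    Illustrative helper only. In Python 2.7, `bytes` is an alias for
    `str`, so unicode objects are not reported.
    """
    return [i for i, p in enumerate(parts) if isinstance(p, bytes)]


# Sample data: items 0 and 2 are byte strings, item 1 is unicode.
sample = [u"\u0123".encode("utf-8"), u"\u0234", b"abc"]
positions = byte_string_positions(sample)
```

If the result is neither empty nor the full range of indexes, the list mixes both types, and join() will attempt an implicit ASCII decode in Python 2.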

1 Comment

Thanks for your help. The utf8_strings are data loaded by xlrd. xlrd, a magnificent Python module, thankfully returns all (non-numerical) data as unicode. I fiddled with them, and it seems I turned some of them into str.
