I have a list of rows containing non-English text, where each row is a list of strings and ints. I need to write this data to a file, converting all numbers to strings along the way. The data looks like this:

[[u'12', u'as', u'ss', u'ge', u'ge', u'm\xfcnze', u'10.0', u'25.2', u'68.05', 1, 2, 0],
[u'13', u'aas', u'sss', u'tge', u'a', u'mat', u'11.0', u'35.7', u'10.1', 1, 1, 1], ...]

The loop breaks on the first row, which contains u'm\xfcnze'.

import codecs

with codecs.open("temp.txt", "w", encoding="utf-8") as f:
    for row in data:
        f.write(' '.join([str(r) for r in row]))
        f.write('\n')

The code above fails with UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 38: ordinal not in range(128).

Trying r.encode('utf-8') if isinstance(r, str) does not solve the issue. What am I doing wrong?

  • What is data? I would like to know data type and structure to be able to help you. Commented Sep 1, 2017 at 18:40
  • @gsi-frank I have updated the question Commented Sep 1, 2017 at 18:50

1 Answer

This should work:

import codecs

with codecs.open("temp.txt", "w", encoding="utf-8") as f:
    for row in data:
        f.write(' '.join([unicode(r) for r in row]))
        f.write('\n')

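The only change is replacing str() with unicode(). In Python 2, str() tries to encode unicode text as ASCII, which fails on non-ASCII characters such as ü. unicode() keeps the strings as unicode and also converts the ints, so the codecs writer can encode each joined line as UTF-8.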
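To see where the 'ascii' codec comes into play, here is a minimal sketch (the variable names are only illustrative, not from the question). It reproduces both kinds of failure explicitly: encoding the text to ASCII, which is what Python 2's str() does implicitly on a unicode object, and decoding its UTF-8 bytes as ASCII, which is what Python 2 does implicitly when a byte string is mixed with unicode. The second case is probably the source of the reported error, since 0xc3 is the first byte of ü in UTF-8.

```python
# -*- coding: utf-8 -*-
text = u'm\xfcnze'                  # the value that breaks the loop
utf8_bytes = text.encode('utf-8')   # two bytes for the u-umlaut: 0xc3 0xbc

# Explicit version of what str(text) attempts in Python 2.
try:
    text.encode('ascii')
    encode_failed = False
except UnicodeEncodeError:
    encode_failed = True

# Explicit version of the implicit decode Python 2 performs when
# UTF-8 encoded bytes are combined with unicode text.
try:
    utf8_bytes.decode('ascii')
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc              # points at the offending 0xc3 byte
```

Both calls fail for the same reason: ASCII only covers code points below 128, so the text has to stay unicode until the file object encodes it as UTF-8.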

Note: in Python 3, str is already a Unicode string type, so your original code works there without any modification (no str -> unicode change needed).
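If the script has to run on both Python 2 and Python 3, one possible approach is sketched below. The names text_type and write_rows are hypothetical helpers, the sample data is a shortened version of the rows in the question, and io.open is used here in place of codecs.open (it behaves the same on both versions). Treat it as a sketch under those assumptions rather than the only way to do it.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

# Hypothetical alias: unicode on Python 2, str on Python 3.
try:
    text_type = unicode
except NameError:
    text_type = str

# Shortened sample of the rows from the question.
data = [
    [u'12', u'as', u'm\xfcnze', u'10.0', 1, 2, 0],
    [u'13', u'aas', u'mat', u'11.0', 1, 1, 1],
]

def write_rows(path, rows):
    """Write each row as one space-separated, UTF-8 encoded line."""
    with io.open(path, "w", encoding="utf-8") as f:
        for row in rows:
            # Convert every item (text or int) to unicode text first;
            # the file object does the UTF-8 encoding.
            f.write(u' '.join(text_type(r) for r in row))
            f.write(u'\n')

path = os.path.join(tempfile.mkdtemp(), "temp.txt")
write_rows(path, data)

# Read the file back to check the round trip.
with io.open(path, encoding="utf-8") as f:
    lines = f.read().splitlines()
```

The point of the design is to never call encode() by hand: everything stays unicode text until it reaches the file object.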
