In Python 3 (ipython-qt terminal) I can do:
In [12]: b=[u'maça', u'banana',u'morango']
In [13]: np.savetxt('test.txt',b,fmt='%s')
In [14]: cat test.txt
ma�a
banana
morango
In [15]: with open('test1.txt','w') as f:
    ...:     for l in b:
    ...:         f.write('%s\n'%l)
    ...:
In [16]: cat test1.txt
maça
banana
morango
savetxt, in both Py2 and Py3, insists on opening the file in 'wb' (byte) mode, and the line in your traceback calls its asbytes function.
In my example b is a list, but that doesn't matter.
In [17]: c=np.array(['maça', 'banana','morango'])
In [18]: c
Out[18]:
array(['maça', 'banana', 'morango'],
dtype='<U7')
Saving this array with savetxt writes the same file. In Py3 the default string type is unicode, so the u prefix isn't needed, but it does no harm.
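As a side note, here is a minimal sketch of one way around the byte-mode behaviour. It assumes a NumPy release of 1.14 or later, where savetxt accepts an encoding argument; the scratch path is just a placeholder I made up for the example.

```python
import io
import os
import tempfile

import numpy as np

c = np.array(['maça', 'banana', 'morango'])

# Hypothetical scratch file; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), 'test_utf8.txt')

# Assumption: encoding= is available (NumPy 1.14 or later).
np.savetxt(path, c, fmt='%s', encoding='utf-8')

# Read the file back as UTF-8 text to check what was written.
with io.open(path, encoding='utf-8') as f:
    text = f.read()
```

On older NumPy versions this keyword does not exist, so the plain write shown in this answer is still the fallback.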
In Python 2 I get your error with a plain write:
>>> b=[u'maça', u'banana',u'morango']
>>> with open('test.txt','w') as f:
...     for l in b:
...         f.write('%s\n'%l)
...
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe7' in position 2: ordinal not in range(128)
Adding the encode gives a nice output:
>>> b=[u'maça', u'banana',u'morango']
>>> with open('test.txt','w') as f:
...     for l in b:
...         f.write('%s\n'%l.encode('utf-8'))
0729:~/mypy$ cat test.txt
maça
banana
morango
encode is a string method, so it has to be applied to the individual elements of an array (or list).
Back on the py3 side, if I use the encode I get:
In [26]: c1=np.array([l.encode('utf-8') for l in b])
In [27]: c1
Out[27]:
array([b'ma\xc3\xa7a', b'banana', b'morango'],
dtype='|S7')
In [28]: np.savetxt('test.txt',c1,fmt='%s')
In [29]: cat test.txt
b'ma\xc3\xa7a'
b'banana'
b'morango'
But with the file opened in 'wb' mode and a bytes format string, the plain write works:
In [33]: with open('test1.txt','wb') as f:
    ...:     for l in c1:
    ...:         f.write(b'%s\n'%l)
    ...:
In [34]: cat test1.txt
maça
banana
morango
Such are the joys of mixing unicode and the 2 Python generations.
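One way to sidestep the difference between the two generations is io.open, which takes an encoding argument in both Py2 and Py3. This is only a sketch of that idea, not something savetxt does for you; the scratch path is a made-up placeholder.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

b = [u'maça', u'banana', u'morango']

# Hypothetical scratch file; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), 'test1.txt')

# Text mode with an explicit encoding: the file object does the
# unicode -> bytes conversion, so no per-element encode is needed.
with io.open(path, 'w', encoding='utf-8', newline='\n') as f:
    for l in b:
        f.write(u'%s\n' % l)

# Read the raw bytes back to see what actually landed on disk.
with open(path, 'rb') as f:
    raw = f.read()
```

The u prefix on the format string matters in Py2, where io.open in text mode only accepts unicode.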
In case it helps, here's the code for the np.lib.npyio.asbytes function that np.savetxt uses (along with the wb file mode):
def asbytes(s):  # py3?
    if isinstance(s, bytes):
        return s
    return str(s).encode('latin1')
(note the encoding is fixed as 'latin1').
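That fixed 'latin1' is most likely why the first cat above shows a broken character: the terminal expects UTF-8, and the latin1 byte for ç is not valid UTF-8 on its own. A small sketch comparing the two encodings (the variable names are mine, purely for illustration):

```python
# -*- coding: utf-8 -*-
s = u'maça'

# What asbytes would produce for this string.
latin1_bytes = s.encode('latin1')
# What a UTF-8 terminal expects to see.
utf8_bytes = s.encode('utf-8')

# Check whether the latin1 bytes can be read back as UTF-8.
try:
    latin1_bytes.decode('utf-8')
    decodes_as_utf8 = True
except UnicodeDecodeError:
    decodes_as_utf8 = False
```

Here latin1_bytes is b'ma\xe7a' (one byte for ç) while utf8_bytes is b'ma\xc3\xa7a' (two bytes), and the latin1 version fails to decode as UTF-8.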
The np.char library applies a variety of string methods to the elements of a numpy array, so the np.array([x.encode...]) can be expressed as:
In [50]: np.char.encode(b,'utf-8')
Out[50]:
array([b'ma\xc3\xa7a', b'banana', b'morango'],
dtype='|S7')
This can be convenient, though past testing indicates that it is not a time saver. It still has to apply the Python method to each element.
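For comparison, a short sketch showing that np.char.encode and the list comprehension give the same array, and that np.char.decode reverses it. The variable names are just for this example.

```python
import numpy as np

b = ['maça', 'banana', 'morango']

# Element-wise encode via np.char.
via_char = np.char.encode(b, 'utf-8')
# The same thing spelled out with a list comprehension.
via_loop = np.array([x.encode('utf-8') for x in b])

# Both routes should produce identical byte-string arrays.
same = bool((via_char == via_loop).all())

# Decoding goes back to the original unicode strings.
round_trip = np.char.decode(via_char, 'utf-8')
```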