I have a function that should save the XML from a response to a file. Its arguments are the response and the name of the file (objNm):

def getXml(response, objNm):
    root = ET.fromstring(response.text)
    tree = ET.ElementTree(root)
    xmlNm = objNm + ".xml"
    tree.write(open(xmlNm, 'w'), encoding='unicode')
    print('Object {} was successfully created.'.format(xmlNm))

Calling it raises this error:

Traceback (most recent call last):
  File "test.py", line 56, in <module>
    getXml(response, 'test_example')
  File "test.py", line 17, in getXml
    root = ET.fromstring(response.text)
  File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 1300, in XML
    parser.feed(text)
  File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 1640, in feed
    self._parser.Parse(data, 0)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 142489-142490: ordinal not in range(128)

Using root = ET.fromstring(response.text.decode('utf-8')) instead raises:

Traceback (most recent call last):
  File "test.py", line 56, in <module>
    getXml(response, 'test_example')
  File "test.py", line 17, in getXml
    root = ET.fromstring(response.text.decode('utf-8'))
  File "/usr/lib64/python2.7/encodings/utf_8.py", line 16, in decode
     return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 142489-142490: ordinal not in range(128)

I have also tried encoding to UTF-8, but that did not help either.

Can anybody help me eliminate this error?

  • At what line does this raise? Commented Apr 9, 2019 at 11:17
  • Can you copy here by any chance the text between these indexes? 142489-142490 ? Theoretically you could do a slice like response.text[142489:142490+1] Commented Apr 9, 2019 at 12:03
  • ë that's what it gives after the slice Commented Apr 9, 2019 at 12:09
  • And I'm assuming that type(response.text) yields bytes ? Commented Apr 9, 2019 at 12:21
  • it's <type 'unicode'> Commented Apr 9, 2019 at 12:24

1 Answer

In Python 2.7, source files are read as ASCII by default. If your script itself contains non-ASCII characters, you need to specify # -*- coding: utf-8 -*- at the top of the file.

Some other things that can be done:

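Call encoded_text = response.text.encode('utf-8', 'replace') and pass the result to ET.fromstring(encoded_text). Since response.text is a unicode object, this hands the parser UTF-8 bytes instead of leaving it to attempt an implicit ASCII conversion.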

Tested via:

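import codecs
# Sample unicode text containing non-ASCII characters
data = u'abcdëëaaë'
# Encode to UTF-8 bytes first
data = data.encode('utf-8', 'replace')
# Decoding the bytes succeeds; the result is a (text, bytes_consumed) tuple
something = codecs.utf_8_decode(data, 'strict', True)
print(something)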
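
Applied to the function from the question, the encode step might look like the sketch below. The names get_xml, xml_text and obj_name are made up for illustration, and it assumes the XML either declares UTF-8 or declares no encoding at all. It also passes the file name straight to tree.write and writes UTF-8, rather than opening the file by hand.

```python
import xml.etree.ElementTree as ET


def get_xml(xml_text, obj_name):
    """Parse XML text and save it as <obj_name>.xml; return the file name."""
    # Encode explicitly so the parser receives UTF-8 bytes rather than
    # attempting an implicit ASCII conversion of the unicode text.
    root = ET.fromstring(xml_text.encode('utf-8'))
    tree = ET.ElementTree(root)
    xml_name = obj_name + '.xml'
    # Given a file name, ElementTree opens and closes the file itself.
    tree.write(xml_name, encoding='utf-8', xml_declaration=True)
    return xml_name
```

It would be called as get_xml(response.text, 'test_example').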

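An alternative is to set UTF-8 as the default encoding for the whole process. This is a Python 2-only workaround and is generally discouraged, because it changes every implicit conversion: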

import sys
reload(sys)
sys.setdefaultencoding('utf-8')
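
Because changing the default encoding affects the whole process, a more local option is to name the codec explicitly wherever bytes are needed. A small sketch of the difference, using a made-up sample string:

```python
# u'abcd\xeb' is u'abcdë', written with an escape so this file stays pure ASCII.
sample = u'abcd\xeb'

# An ASCII conversion cannot represent the character and raises,
# which is the same kind of failure as in the traceback.
try:
    sample.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Naming the codec explicitly works without touching the interpreter default.
utf8_bytes = sample.encode('utf-8')
round_tripped = utf8_bytes.decode('utf-8')
```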

4 Comments

@JanFi86 check my answer regarding the file encoding # -*- coding: utf-8 -*-
@JanFi86 really?? ... I was trying the following codecs.utf_8_decode(u'abcdë', 'strict', True) in my python console but still got some errors even with specifying the utf-8 encoding... now I'm baffled.
@JanFi86 check my answer again
calling encoded_text = response.text.encode('utf-8', 'replace') helped definitely, doublechecked, looks fine, thank you so much
