How to replace characters that cannot be decoded as UTF-8 with whitespace?
# -*- coding: utf-8 -*-
print unicode('\x97', 'utf-8', errors='ignore') # prints an empty line
print unicode('ABC\x97abc', 'utf-8', errors='ignore') # prints ABCabc
How can I print out ABC abc instead of ABCabc? Note that \x97 is just an example. The bytes that cannot be decoded come from unknown input, so I cannot list them in advance.
- If we use errors='ignore', the undecodable character is dropped entirely, so nothing is printed in its place.
- If we use errors='replace', the character is replaced with the Unicode replacement character U+FFFD rather than a space.
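To make the desired behaviour concrete, here is a minimal sketch of one possible approach, assuming a custom error handler registered through `codecs.register_error` is acceptable. The handler name `space_replace` is hypothetical and not part of the standard library. The input is written as a bytes literal so that the same snippet should behave the same way on Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
import codecs

def space_replace(exc):
    # Hypothetical handler: emit one space for each undecodable byte,
    # then resume decoding right after the bad span.
    if isinstance(exc, UnicodeDecodeError):
        return (u' ' * (exc.end - exc.start), exc.end)
    # Only decoding errors are handled in this sketch.
    raise exc

# Make the handler available under the (assumed) name 'space_replace'.
codecs.register_error('space_replace', space_replace)

raw = b'ABC\x97abc'
text = raw.decode('utf-8', 'space_replace')
print(text)  # ABC abc
```

A simpler alternative might be to decode with errors='replace' and then replace u'\ufffd' with a space, but that would also rewrite any U+FFFD characters that were legitimately present in the input.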