My string is Niệm Bồ Tát (Thiá»n sư Nhất Hạnh) and I want to decode it to Niệm Bồ Tát (Thiền sư Nhất Hạnh). I see that this site can do that: http://www.enderminh.com/minh/utf8-to-unicode-converter.aspx
So I tried it in Python:
mystr = '09. Bát Nhã Tâm Kinh'
mystr.decode('utf-8')
but this is not correct: the original string is UTF-8, yet the string shown is not the result I expect.
Note: the text contains Vietnamese characters.
How can I resolve this? Is it Windows Unicode or something? How can I detect the encoding here?
It is utf-8, but interpreted as latin-1.
>>> "Niệm Bồ Tát (Thiền sư Nhất Hạnh)".encode('utf-8').decode('latin-1')
'Niá»\x87m Bá»\x93 Tát (Thiá»\x81n sư Nhất Hạnh)'
pretty close...
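To repair the garbled text, the same round trip can be run in reverse: encode back to latin-1 to recover the raw bytes, then decode those bytes as utf-8. Here is a minimal Python 3 sketch, assuming the damage really is UTF-8 read as latin-1. `fix_mojibake` is a made-up helper name for illustration, not a library function.

```python
def fix_mojibake(text):
    """Hypothetical helper: undo UTF-8 text that was wrongly decoded as latin-1."""
    try:
        # latin-1 maps code points 0-255 straight back to bytes 0-255,
        # so this recovers the original UTF-8 byte sequence.
        return text.encode('latin-1').decode('utf-8')
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not latin-1 mojibake (or already correct text): leave it alone.
        return text


original = "Niệm Bồ Tát (Thiền sư Nhất Hạnh)"

# Reproduce the damage, then reverse it.
garbled = original.encode('utf-8').decode('latin-1')
repaired = fix_mojibake(garbled)

print(repaired == original)
```

If the text was mangled as Windows-1252 rather than latin-1, the encode step would probably need 'cp1252' instead. Some bytes (such as 0x81) have no cp1252 mapping, so that variant can fail on strings like this one.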