Is it possible to convert a string to unicode characters in Python?

For example: Hello => \u0048\u0065\u006C\u006C\u006F

1 Answer

You can't. A string already consists of 'Unicode characters'. What you want to change is not its representation but its actual contents.

That, fortunately, is as simple as

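import re
text = 'Hello'
# Replace every character with a \uXXXX escape (four uppercase hex digits).
# re.DOTALL makes '.' match newlines too, so no character is skipped.
print(re.sub('.', lambda m: r'\u%04X' % ord(m.group()), text, flags=re.DOTALL))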

which outputs

\u0048\u0065\u006C\u006C\u006F
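
One caveat: four hex digits only cover code points up to U+FFFF. If your text may contain anything above that (emoji, for instance), here is a rough sketch of one way to handle it, assuming you are fine with Python's eight-digit \UXXXXXXXX form for those characters. The function name to_unicode_escapes is made up for this example; it is not a built-in.

```python
def to_unicode_escapes(text):
    r"""Return text with every character written as a \uXXXX escape.

    Code points above U+FFFF do not fit in four hex digits, so they are
    written in Python's eight-digit \UXXXXXXXX form instead.
    (to_unicode_escapes is a hypothetical helper name, not a built-in.)
    """
    parts = []
    for char in text:
        code = ord(char)
        if code <= 0xFFFF:
            parts.append('\\u{:04X}'.format(code))
        else:
            parts.append('\\U{:08X}'.format(code))
    return ''.join(parts)


escaped = to_unicode_escapes('Hello')
print(escaped)  # \u0048\u0065\u006C\u006C\u006F

# The built-in unicode_escape codec turns the escapes back into text.
print(escaped.encode('ascii').decode('unicode_escape'))  # Hello
```

Looping over the string instead of using a regular expression also sidesteps the question of whether '.' matches newlines.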

7 Comments

Thank you! What is this conversion called? I've been googling for almost an hour now. I would assume this has already been asked and been answered somewhere.
It's a regular expression that replaces each character with its own Unicode string through a custom replacement function. But there are several other approaches possible – this is only one. The basic idea is that each character gets replaced with another string.
I know it's a regular expression and that it uses a custom replacement function; I was just asking what converting the character H to \u0048 is called.
print(''.join(r'\u{:04X}'.format(ord(char)) for char in 'Hello'))
@furas I like that version too, looks pythonic. :)