I am new to Python. I want to have a list of strings containing non-ASCII characters.

This answer suggested a way to do this, but when I tried the code, I got some weird results. Please see the following MWE:

#-*- coding: utf-8 -*-
mylist = ["अ,ब,क"]
print mylist

The output was ['\xe0\xa4\x85,\xe0\xa4\xac,\xe0\xa4\x95']

When I use ASCII characters in the list, say ["a,b,c"], the output is ['a,b,c'], as expected. I want the output of my code to be ["अ,ब,क"].

How can I do this?

PS: I am using Python 2.7.16.

    If you are only just learning the basics, you should probably ignore Python 2, and spend your time on the currently recommended and supported version of the language, which is Python 3. Commented Aug 17, 2019 at 7:41

2 Answers

2

You want to mark these as Unicode strings by prefixing each literal with u.

mylist = [u"अ,ब,क"]

Depending on what you want to accomplish, if the data is just a single string, it might not need to be in a list. Or perhaps you want a list of strings?

mylist = [u"अ", u"ब", u"क"]
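Note that printing the list itself will still show escape sequences in Python 2, because print on a list displays the repr() of each element. A minimal sketch of one way around that, assuming you want the items shown as comma-separated text: print the strings themselves rather than the list. The name joined is just for illustration, and print(...) with a single argument is used so the snippet runs on both Python 2 and Python 3.

```python
#-*- coding: utf-8 -*-
mylist = [u"अ", u"ब", u"क"]

# Join the elements into one Unicode string; printing a string shows
# its text, whereas printing a list shows the repr() of each element.
joined = u",".join(mylist)
print(joined)

# Alternatively, print the elements one at a time.
for item in mylist:
    print(item)
```

Whether the characters actually render still depends on the encoding of your terminal.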

Python 3 brings a lot of relief to working with Unicode, and doesn't need the u prefix in front of Unicode strings, because all strings are Unicode. It should definitely be your learning target unless you are specifically tasked with maintaining legacy software after Python 2 is officially abandoned at the end of 2019.

Regardless of your Python version, there may still be issues with displaying Unicode on your system, in particular on older systems and on Windows.
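As a rough way to check whether display problems come from your environment, you can look at the encoding your output stream reports and test whether a string can be encoded with it. This is only a sketch: describe_output_encoding and can_display are hypothetical helper names made up for this example, not part of any library, and a successful encode does not guarantee that your terminal font can draw the characters.

```python
#-*- coding: utf-8 -*-
import sys

def describe_output_encoding(stream=None):
    """Return the encoding the stream reports, or 'unknown' if it reports none."""
    stream = sys.stdout if stream is None else stream
    return getattr(stream, "encoding", None) or "unknown"

def can_display(text, encoding):
    """Return True if the Unicode string `text` can be encoded with `encoding`."""
    try:
        text.encode(encoding)
        return True
    except (UnicodeEncodeError, LookupError):
        # LookupError covers encoding names Python does not know.
        return False

print(describe_output_encoding())
print(can_display(u"अ,ब,क", "utf-8"))
print(can_display(u"अ,ब,क", "ascii"))
```

If the reported encoding cannot represent your text, the fix is usually in the terminal or environment settings rather than in the program.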

If you are unfamiliar with encoding issues, you'll want to read The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) and perhaps the Python-specific Pragmatic Unicode.

0

Use:

#-*- coding: utf-8 -*-
mylist = ["अ,ब,क"]
print [unicode(i) for i in mylist]

Or use:

#-*- coding: utf-8 -*-
mylist = ["अ,ब,क"]
print map(unicode, mylist)

2 Comments

I got this error:

Traceback (most recent call last):
  File "Test.py", line 3, in <module>
    print map(unicode, mylist)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe0 in position 0: ordinal not in range(128)

For the first code, the error was:

Traceback (most recent call last):
  File "Test.py", line 3, in <module>
    print [unicode(i) for i in mylist]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe0 in position 0: ordinal not in range(128)
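The errors above arise because unicode(i) with no encoding argument decodes the byte string using the default ASCII codec in Python 2, and the UTF-8 bytes of these characters are outside the ASCII range. A possible workaround, sketched under the assumption that the source file really is saved as UTF-8, is to decode with the encoding named explicitly. The byte values below are the ones from the question's output, and the names raw_items, ascii_failed and decoded are illustrative only.

```python
#-*- coding: utf-8 -*-
# In Python 2, the plain literal "अ,ब,क" in a UTF-8 source file
# is stored as exactly these bytes.
raw_items = [b"\xe0\xa4\x85,\xe0\xa4\xac,\xe0\xa4\x95"]

# Decoding as ASCII fails on the first non-ASCII byte, which is
# the same failure reported in the tracebacks.
try:
    raw_items[0].decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Naming the encoding explicitly decodes the bytes correctly.
decoded = [item.decode("utf-8") for item in raw_items]

for text in decoded:
    print(text)
```

In Python 2, unicode(i, "utf-8") should be equivalent to i.decode("utf-8") for byte strings.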
