How to extract a unicode string from a string

Question

I have a string of the form "text: u'\u0644'". How can I extract the inner unicode string in Python (i.e. end up with u'\u0644')?

When I use split() I get "u'\\u0644'", which is a plain string holding the escape sequence as literal text, not a unicode string!

If this is JSON, use json.loads to decode it to Python data structures.
– Ned Batchelder, Commented Aug 24, 2014 at 14:45
Where did it come from? There is likely an easier way to get the information out of it.
– Ned Batchelder, Commented Aug 24, 2014 at 14:47
The text was crawled from Facebook, in case that helps!
– bachr, Commented Aug 24, 2014 at 14:49
1) Facebook has an API, which will be easier than screen-scraping their HTML, and 2) the text you are getting from their HTML is likely valid JSON. I strongly suggest that you back up and reconsider how you are approaching this.
– Ned Batchelder, Commented Aug 24, 2014 at 14:50
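
To illustrate the json.loads suggestion above, here is a minimal sketch. It assumes the crawled text really is JSON; the payload shown is hypothetical and is not taken from the question.

```python
import json

# Hypothetical payload: assumes the scraped text is valid JSON.
# The raw string keeps the \u0644 escape as literal characters,
# the way it would appear in text read from a page.
payload = r'{"text": "\u0644"}'

# json.loads decodes the \uXXXX escape while parsing.
data = json.loads(payload)
text = data["text"]

print(len(text))  # 1
```

If the input is not valid JSON (for example, it contains Python-style u'...' literals as in the question), json.loads will raise an error and this approach does not apply.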

mhawke · Accepted Answer · 2014-08-24 14:50:41Z


You can use ast.literal_eval() to safely convert the literal string:

>>> from ast import literal_eval

>>> s = "text: u'\u0644'"

>>> unicode_part = s.split(':')[-1].strip()
>>> unicode_part
"u'\\u0644'"

>>> unicode_string = literal_eval(unicode_part)
>>> unicode_string
u'\u0644'
>>> print unicode_string
ل
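
The session above is Python 2 (note the print statement). As a rough sketch of the same steps on Python 3, assuming the input arrives with the backslash sequence as literal characters, a raw string stands in for the scraped text:

```python
from ast import literal_eval

# Raw string so that \u0644 stays as six literal characters,
# which is an assumption about how the scraped text looks.
s = r"text: u'\u0644'"

# Take everything after the last colon and trim surrounding whitespace.
unicode_part = s.split(':')[-1].strip()

# literal_eval only accepts Python literals, so it does not run arbitrary code.
# Python 3.3+ still accepts the u'' prefix.
unicode_string = literal_eval(unicode_part)

print(unicode_string == "\u0644")  # True
```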


2 Comments

Padraic Cunningham Over a year ago

why not just split on whitespace?

mhawke Over a year ago

You can, and originally I did that, but the string looks like some sort of key-value pair where the key and value are delimited by ":", hence using ":" for the split. If you could be certain that there was always a space, you could split on the space, or even on ": " (colon plus space) if there was always exactly one space, and avoid the .strip(). But splitting on ":" and stripping is probably more robust.
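
A small sketch of the trade-off discussed in these comments. The helper name extract_value is made up for illustration, and using str.partition instead of split(':')[-1] is my own variation, not the answer's code; it is written for Python 3.

```python
from ast import literal_eval

def extract_value(line):
    """Return the Python literal that follows the first ':' in line."""
    # partition splits on the first colon only, so a colon inside
    # the value itself is left alone.
    _key, sep, value = line.partition(':')
    if not sep:
        raise ValueError("no ':' delimiter found in %r" % (line,))
    # strip() tolerates zero, one, or several spaces after the colon.
    return literal_eval(value.strip())

# Same value with different spacing after the delimiter.
samples = [r"text: u'\u0644'", r"text:u'\u0644'", r"text:   u'\u0644'"]
results = [extract_value(s) for s in samples]
print(all(r == "\u0644" for r in results))  # True
```

Splitting on whitespace would fail on the second sample, and splitting on ": " would leave extra spaces in the third, which is why stripping after a split on ":" tolerates more input variations.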
