How do I decode escaped unicode javascript code in Python?

Question

I have this string:

V posledn\u00edch m\u011bs\u00edc\u00edch se bezpe\u010dnostn\u00ed situace v Libyi zna\u010dn\u011b zhor\u0161ila, o \u010dem\u017e sv\u011bd\u010d\u00ed i ned\u00e1vn\u00e9 n\u00e1hl\u00e9 opu\u0161t\u011bn\u00ed zem\u011b nejen \u010desk\u00fdmi diplomaty. Libyi hroz\u00ed nekontrolovan\u00fd rozpad a nekone\u010d

Which should read "V posledních měsících se ..." so \u00ed is í and \u011b is ě.

Any idea how to decode this in Python? It is JavaScript code that I am parsing in Python. I could write my own ad-hoc solution, as there are not that many escaped characters (there are only twelve or so accented characters in Czech), but that seems ugly.

Jossef Harush Kadouri · Accepted Answer · 2022-07-01 19:14:20Z

11

Decode it using the 'unicode-escape' codec: if x is your string, use x.decode('unicode-escape').
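On Python 3, str has no decode method, so the string has to be turned into bytes first. The following is a minimal sketch under that assumption; the variable names and the shortened sample are only for illustration.

```python
# Python 3 sketch: a str must be converted to bytes before it can be decoded.
# The raw string stands in for text read from a file or an HTTP response.
escaped = r"V posledn\u00edch m\u011bs\u00edc\u00edch"

# Encoding as ASCII fails loudly if the input already contains non-ASCII
# characters, which the 'unicode-escape' codec would otherwise misread as Latin-1.
decoded = escaped.encode("ascii").decode("unicode-escape")
print(decoded)  # V posledních měsících
```

If the input mixes real accented characters with escapes, this approach is not suitable as written, because 'unicode-escape' interprets every non-escaped byte as Latin-1.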

edited Jul 1, 2022 at 19:14

Jossef Harush Kadouri

answered Aug 23, 2014 at 1:27

BrenBarn

3 Comments

Gallaecio Over a year ago

'\u2019'.decode('unicode-escape') gives me u'\u2019' (Python 2.7.17)

Gallaecio Over a year ago

My bad, r'\u2019'.decode('unicode-escape') gives u'\u2019', which printed gives ’ as expected

Jossef Harush Kadouri Over a year ago

If you are dealing with a string in Python that already contains escapes like this, use .encode().decode('unicode-escape').

Ned Batchelder · Accepted Answer · 2014-08-23 01:29:50Z

1

If it is JavaScript code, then perhaps it's actually JSON, and you can use json.loads to decode it.
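Note that json.loads expects a complete JSON document, so a bare fragment of text is rejected. A minimal sketch, assuming the fragment is the body of a double-quoted string literal with no unescaped double quotes or raw control characters inside it (the variable names are hypothetical):

```python
import json

# Hypothetical input: the inside of a JavaScript string literal, without quotes.
fragment = r"bezpe\u010dnostn\u00ed situace"

# Wrapping the fragment in double quotes turns it into a valid JSON string.
text = json.loads('"' + fragment + '"')
print(text)  # bezpečnostní situace
```

Unlike 'unicode-escape', the JSON decoder also combines surrogate pairs such as \ud83d\ude00 into a single character, which matters for text outside the Basic Multilingual Plane.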

answered Aug 23, 2014 at 1:29

Ned Batchelder

1 Comment

sup Over a year ago

That does not seem to work straight away (it says it is not JSON), and the answer by BrenBarn actually works great. Thanks, though!

Patrick Sampaio · Accepted Answer · 2021-02-01 17:57:41Z

0

I had a similar issue, which was solved by:

unicodedata.normalize('NFD', my_string.decode('unicode-escape')).encode('ascii','ignore')
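Be aware that this one-liner does more than decode: it also strips the diacritics and returns plain ASCII, which may or may not be what you want. A rough Python 3 equivalent, assuming my_string is a str holding the escaped text (the sample value is taken from the question):

```python
import unicodedata

my_string = r"zna\u010dn\u011b zhor\u0161ila"  # sample from the question

# Step 1: turn the \uXXXX escapes into real characters.
decoded = my_string.encode("ascii").decode("unicode-escape")

# Step 2: NFD splits each accented letter into a base letter plus combining
# marks; encoding to ASCII with errors="ignore" then drops those marks.
stripped = (
    unicodedata.normalize("NFD", decoded)
    .encode("ascii", "ignore")
    .decode("ascii")
)

print(decoded)   # značně zhoršila
print(stripped)  # znacne zhorsila
```

If the goal is readable Czech text, stop after the first step and keep decoded.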

answered Feb 1, 2021 at 17:57

Patrick Sampaio
