1

I have a string like this:

text = 'b\'"Bill of the  one\\xe2\\x80\\x99s store wanted to go outside.\''

That is clearly meant to be a bytes object, but when I check the object's type, I get:

type(text)  
<class 'str'>

I tried encoding it to bytes and then decoding it, but this was the result:

text.encode("utf-8").decode("utf-8")
'b\'"Bill of the oneâ\x80\x99s store wanted to go outside.\''

How can I get the text properly formatted?

2 Answers

3

As another possible approach, it seems to me that the string you have is the result of calling repr on a bytes object. You can reverse a repr of a literal by calling ast.literal_eval:

>>> import ast
>>> x = b'test string'
>>> y = repr(x)
>>> y
"b'test string'"
>>> ast.literal_eval(y)
b'test string'

Or in your case:

>>> x = 'b\'"Bill of the  one\\xe2\\x80\\x99s store wanted to go outside.\''
>>> import ast
>>> ast.literal_eval(x)
b'"Bill of the  one\xe2\x80\x99s store wanted to go outside.'
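
The result above is still a bytes object, so one more step is needed to get readable text. A minimal sketch, assuming the bytes are UTF-8 encoded (the \xe2\x80\x99 sequence suggests so, since it is the UTF-8 encoding of a right single quotation mark); the names raw and decoded are illustrative and not from the question:

```python
import ast

# The string from the question: the repr of a bytes object, stored as a str.
text = 'b\'"Bill of the  one\\xe2\\x80\\x99s store wanted to go outside.\''

# Step 1: turn the repr back into a real bytes object.
raw = ast.literal_eval(text)

# Step 2: decode the bytes to str. UTF-8 is an assumption here;
# \xe2\x80\x99 then becomes U+2019 (right single quotation mark).
decoded = raw.decode("utf-8")

print(type(raw))
print(decoded)
```

If the input might not be a valid literal, ast.literal_eval raises ValueError or SyntaxError, and decode raises UnicodeDecodeError when the bytes are not valid UTF-8, so untrusted input may warrant a try/except around both steps.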

1

Why are you calling both encode and decode on the string? Doing both just brings you back to the same state (a string). Calling encode alone is sufficient.

text = 'b\'"Bill of the  one\\xe2\\x80\\x99s store wanted to go outside.\''
type(text) #This will output <class 'str'>

Now, to get a bytes object, use the snippet below:

byte_object=text.encode("utf-8")
type(byte_object) #This will output <class 'bytes'>

2 Comments

Right, but now byte_object == b'b\'"Bill of the one\\xe2\\x80\\x99s store wanted to go outside.\''
OK, I misunderstood the question. After seeing @brianpck's answer I understand your requirement: you can use ast, which is meant for this.
