How does one add string prefixes to variables in Python?

Question

The term 'string prefix' is explained here.

If a string has already been assigned to a variable, how do you add a string prefix to it (without writing out the same string again)? The result can be assigned to a new variable or reassigned to the same one.

"String Encoding declarations" is not an actual term, and it would be a terrible term if it was, as it has nothing to do with encoding. Some rando just edited their own made-up terminology into that answer.
– user2357112, Commented Jul 8, 2020 at 19:53
@user2357112 supports Monica people make up new terms all the time, the way language works is if you understand what they are saying. Do you understand what they are talking about?
– user13894236, Commented Jul 8, 2020 at 19:56
If it didn't literally appear in the sentence "'Letters before strings here are called "String Encoding declarations".", I would not know what they were talking about. I would have guessed it was a misleading term for a PEP 263 encoding comment, which is a completely different thing. There is only one Google hit for python "string encoding declaration" using the "term" in this way, and it's by the guy who made the edit.
– user2357112, Commented Jul 8, 2020 at 20:03

ShadowRanger · Accepted Answer · 2020-07-08 20:36:57Z


You can't retroactively add or remove a string literal prefix. Once it's been made, it's just a str (or bytes with a b prefix). If you need to convert something that was a bytes literal to str or vice versa, you use the bytes.decode or str.encode method respectively, like you would on any bytes or str, regardless of whether it began as a literal or not, because there is no difference between literal and non-literal strings immediately after the literal is evaluated.
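A minimal sketch of that conversion, assuming UTF-8 as the codec (the variable names and the codec choice are illustrative, not something the question specifies):

```python
# A str that already lives in a variable; how it was created no longer matters.
text = "caf\u00e9"

# str -> bytes: encode with an explicit codec (UTF-8 is an assumption here).
data = text.encode("utf-8")
print(type(data), data)

# bytes -> str: decode with the same codec to recover the original text.
round_trip = data.decode("utf-8")
print(round_trip == text)
```

The result of encode is an ordinary bytes object, indistinguishable from one written as a b-prefixed literal with the same contents.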


10 Comments

ShadowRanger Over a year ago

Pedantic note I didn't want in the answer itself: Technically, str literals might be interned where non-literal strs rarely are, but this is not relevant in practice unless you're incorrectly using the is operator for string comparisons.

ShadowRanger Over a year ago

@dsfgh: If you're on Python 3, they're both the same string. u is an optional, meaningless prefix on Python 3 for aid in porting Python 2 scripts. It just says "this string can contain Unicode", but on Python 3, that's how all str are already.

ShadowRanger Over a year ago

@dsfgh: Did you see my first sentence? It cannot be done in the general case. You can change certain str to look like bytes literals ('stng'.encode('latin-1')) but that assumes they contain only latin-1 characters; if they contain non-latin-1 characters, there is no single equivalent byte. Literals are literals; the literal itself possesses or lacks a prefix, you can't retroactively change it.
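A small sketch of the latin-1 limitation described in this comment; the sample strings and the flag name are illustrative assumptions:

```python
# Characters inside latin-1 encode one byte each, so the result
# matches what the literal b'stng' would have produced.
ascii_like = "stng"
print(ascii_like.encode("latin-1"))

# EN DASH (U+2013) is outside latin-1, so no single byte can represent it.
dash = "\u2013"
try:
    dash.encode("latin-1")
except UnicodeEncodeError as exc:
    encode_failed = True
    print("cannot encode:", exc.reason)
else:
    encode_failed = False
```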

ShadowRanger Over a year ago

@dsfgh: Trying to write such a prefix selecting function is difficult to the point where, given the limited utility of such a function and the inability to cover all cases, the "benefits" are not worth the trouble of writing. The whole idea is nonsensical to start with, since, as I said, the prefixes only have meaning in terms of literals; stuff like r and f prefixes are literally impossible to graft on after the fact with anything but the loosest heuristics, and b is almost as bad.

user2357112 Over a year ago

@dsfgh: You're trying to unscramble a scrambled egg and boil it instead.


wjandrea · Accepted Answer · 2020-07-09 00:20:13Z

In general, you can't. String prefixes are part of syntax, not data. In other words, they don't create a different type of string, but create a string in a different way.
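To illustrate "syntax, not data", here is a short sketch (the variable names are made up for the example) showing that differently prefixed literals can yield the very same kind of object:

```python
raw = r"\n"        # raw literal: a backslash followed by the letter n
escaped = "\\n"    # ordinary literal spelling the same two characters

# Both are plain str objects with identical contents.
print(type(raw) is type(escaped))
print(raw == escaped, len(raw))

# The u prefix is accepted on Python 3 but changes nothing.
print(u"abc" == "abc")
```

Once the literal has been evaluated, nothing in the resulting str records which prefix was used to write it.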

u does nothing in Python 3. It only exists for compatibility with Python 2.
f can be emulated with str.format() for simple cases. To fully emulate an f-string you'd have to evaluate it, which is a security risk since f-strings can contain arbitrary code.
r can be emulated with str.encode('unicode_escape').decode() in some cases, but not all. For example, this string literal is lossy:
```
>>> r'\x61'
'\\x61'
>>> s = '\x61'
>>> s
'a'
>>> s.encode('unicode_escape').decode()
'a'
```
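For contrast, a sketch of a case where the unicode_escape approach does line up with a raw literal (the sample string is an assumption chosen for illustration):

```python
# A str containing a real newline and a real tab.
s = "a\nb\tc"

# Spell the control characters back out as backslash escapes.
emulated = s.encode("unicode_escape").decode()
print(emulated)
print(emulated == r"a\nb\tc")

# The lossy direction: both spellings evaluate to the same str,
# so the value alone cannot tell you which one was typed.
print("\x61" == "a")
```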

b is an exception in that it actually does create a different type: a bytes object. It can be emulated with the raw_unicode_escape encoding, though I don't have any experience using it so I'm not sure if it's the same:

```
>>> b'a\x89\u2013'
b'a\x89\\u2013'
>>> 'a\x89\u2013'
'a\x89–'
>>> 'a\x89\u2013'.encode('raw_unicode_escape')
b'a\x89\\u2013'
>>> 'a\x89\u2013'.encode('raw_unicode_escape').decode('raw_unicode_escape')
'a\x89–'
```

Also just for reference, the grammar calls them stringprefix, and just "prefix" in the text.
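Going back to the f prefix, here is a minimal sketch of the "simple cases" where str.format() can stand in for it, and of where it stops working. The template strings and variable names are illustrative assumptions:

```python
name = "world"

# A plain str held in a variable, not an f-string literal.
template = "Hello, {name}!"
greeting = template.format(name=name)
print(greeting)
print(greeting == f"Hello, {name}!")

# str.format() only looks names up; it does not evaluate expressions,
# so a field like {x + 1} is treated as the key "x + 1".
expr_template = "{x + 1}"
try:
    expr_template.format(x=1)
except KeyError as exc:
    expression_failed = True
    print("not evaluated, missing key:", exc)
else:
    expression_failed = False
```

Getting the expression case to work would mean evaluating the template as code, which is exactly the security risk mentioned above.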

Stathis Alexopoulos · Accepted Answer · 2020-07-08 20:45:03Z

According to the Python 2 manual:

Unicode Literals in Python Source Code

In Python source code, Unicode literals are written as strings prefixed with the ‘u’ or ‘U’ character: u'abcdefghijk'. Specific code points can be written using the \u escape sequence, which is followed by four hex digits giving the code point. The \U escape sequence is similar, but expects 8 hex digits, not 4.

But in Python 3

The String Type

Since Python 3.0, the language’s str type contains Unicode characters, meaning any string created using "unicode rocks!", 'unicode rocks!', or the triple-quoted string syntax is stored as Unicode.

The default encoding for Python source code is UTF-8, so you can simply include a Unicode character in a string literal:
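The quoted passage breaks off at the colon. As a stand-in (an illustrative sketch, not the manual's own example), the following shows a non-ASCII character typed directly into the source alongside the same code point written with the \u and \U escapes described earlier:

```python
direct = "–"                 # EN DASH typed directly into UTF-8 source
short_escape = "\u2013"      # \u takes four hex digits
long_escape = "\U00002013"   # \U takes eight hex digits

# All three spellings produce the same one-character str.
print(direct == short_escape == long_escape)
print(hex(ord(direct)))
```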

As for values that already exist in variables, whether they came from user input, from reading a file, or from anywhere else, you have to check for each source how it handles Unicode.
