
The term 'string prefix' is explained here.

If a string has already been assigned to a variable, how do you add a string prefix to it (without writing the same string out again)? The result can be assigned to a new variable or reassigned to the same one.

  • "String Encoding declarations" is not an actual term, and it would be a terrible term if it was, as it has nothing to do with encoding. Some rando just edited their own made-up terminology into that answer. Commented Jul 8, 2020 at 19:53
  • @user2357112 supports Monica Then what is it called? Commented Jul 8, 2020 at 19:53
  • I don't think there's a name. Commented Jul 8, 2020 at 19:55
  • @user2357112 supports Monica people make up new terms all the time, the way language works is if you understand what they are saying. Do you understand what they are talking about? Commented Jul 8, 2020 at 19:56
  • If it didn't literally appear in the sentence "'Letters before strings here are called "String Encoding declarations".", I would not know what they were talking about. I would have guessed it was a misleading term for a PEP 263 encoding comment, which is a completely different thing. There is only one google hit for python "string encoding declaration" using the "term" in this way, and it's by the guy who made the edit. Commented Jul 8, 2020 at 20:03

3 Answers

2

You can't retroactively add or remove a string literal prefix. Once it's been made, it's just a str (or bytes with a b prefix). If you need to convert something that was a bytes literal to str or vice versa, you use the bytes.decode or str.encode method respectively, like you would on any bytes or str, regardless of whether it began as a literal or not. There is no difference between literal and non-literal strings once the literal has been evaluated.
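A minimal sketch of that conversion, assuming Python 3 and UTF-8 as the encoding (the variable names and the sample text are illustrative, not from the question):

```python
# A str that is already bound to a variable; no prefix can be attached to it now.
text = "caf\u00e9"

# str -> bytes: name the encoding explicitly (UTF-8 is an assumption here).
data = text.encode("utf-8")

# bytes -> str: decode with the same encoding to get the original text back.
round_trip = data.decode("utf-8")

print(type(text).__name__, type(data).__name__)
print(round_trip == text)
```

The result can be bound to a new name, as above, or reassigned to the original one (text = text.encode("utf-8")).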


10 Comments

Pedantic note I didn't want in the answer itself: Technically, str literals might be interned where non-literal strs rarely are, but this is not relevant unless you're incorrectly using is for string comparisons.
@dsfgh: If you're on Python 3, they're both the same string. u is an optional, meaningless prefix on Python 3 for aid in porting Python 2 scripts. It just says "this string can contain Unicode", but on Python 3, that's how all str are already.
@dsfgh: Did you see my first sentence? It cannot be done in the general case. You can change certain str to look like bytes literals ('stng'.encode('latin-1')) but that assumes they contain only latin-1 characters; if they contain non-latin-1 characters, there is no single equivalent byte. Literals are literals; the literal itself possesses or lacks a prefix, you can't retroactively change it.
@dsfgh: Trying to write such a prefix selecting function is difficult to the point where, given the limited utility of such a function and the inability to cover all cases, the "benefits" are not worth the trouble of writing. The whole idea is nonsensical to start with, since, as I said, the prefixes only have meaning in terms of literals; stuff like r and f prefixes are literally impossible to graft on after the fact with anything but the loosest heuristics, and b is almost as bad.
@dsfgh: You're trying to unscramble a scrambled egg and boil it instead.
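The latin-1 caveat from these comments can be sketched as follows, assuming Python 3 (the euro sign is just one illustrative character outside latin-1, and the variable names are hypothetical):

```python
# Works: every character in this str has a single latin-1 byte.
ascii_like = "stng".encode("latin-1")

# Fails: the euro sign (U+20AC) has no latin-1 encoding,
# so there is no byte string that "matches" this str.
try:
    "\u20ac".encode("latin-1")
    euro_failed = False
except UnicodeEncodeError:
    euro_failed = True

print(ascii_like, euro_failed)
```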
1

In general, you can't. String prefixes are part of syntax, not data. In other words, they don't create a different type of string, but create a string in a different way.

Also just for reference, the grammar calls them stringprefix, and just "prefix" in the text.
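A short sketch of the "syntax, not data" point, assuming Python 3 (variable names are illustrative). Note that the b prefix is the exception already mentioned in another answer, since it produces bytes rather than str:

```python
# Two spellings of the same two-character str: a backslash followed by "n".
raw_literal = r"\n"
escaped_literal = "\\n"

same_value = raw_literal == escaped_literal
same_type = type(raw_literal) is type(escaped_literal)

# On Python 3 the u prefix is accepted but changes nothing.
u_is_noop = u"abc" == "abc"

# Once a str exists, the escape has already been processed,
# so there is nothing left for an "r" prefix to act on.
already_built = "\n"                       # one character: a newline
matches_raw = already_built == raw_literal

print(same_value, same_type, u_is_noop, matches_raw)
```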


0

According to the Python 2 manual:

Unicode Literals in Python Source Code

In Python source code, Unicode literals are written as strings prefixed with the ‘u’ or ‘U’ character: u'abcdefghijk'. Specific code points can be written using the \u escape sequence, which is followed by four hex digits giving the code point. The \U escape sequence is similar, but expects 8 hex digits, not 4.

But in Python 3:

The String Type

Since Python 3.0, the language’s str type contains Unicode characters, meaning any string created using "unicode rocks!", 'unicode rocks!', or the triple-quoted string syntax is stored as Unicode.

The default encoding for Python source code is UTF-8, so you can simply include a Unicode character in a string literal:
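A minimal illustration of that point, assuming Python 3 and a UTF-8 source file (the strings are illustrative and are not the manual's own example):

```python
# Non-ASCII characters can be typed straight into a literal...
direct = "π ≈ 3.14159"

# ...or written with escape sequences; both produce the same str.
escaped = "\u03c0 \u2248 3.14159"

# The u prefix from Python 2 is still accepted and makes no difference.
with_u_prefix = u"π ≈ 3.14159"

print(direct == escaped == with_u_prefix)
```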

As for values that are already stored in variables, whether they came from user input, from reading a file, or from anywhere else, you have to check how each of those methods handles Unicode.

