I have a few questions I would like to ask about string literals and C-strings.

So if I have something like this:

char cstr[] = "c-string";

As I understand it, the string literal is created in memory with a terminating null byte, say starting at address 0xA0 and ending at 0xA8, and from there the address is returned and/or cast to type char[], which then points to that address.

It is then legal to perform this:

for (size_t i = 0; i < sizeof(cstr) - 1; ++i)  // stop before the terminating '\0'
    cstr[i] = 'a' + i;

So in this sense, can string literals be modified as long as they are cast to type char[]?

But with regular pointers, I've come to understand that when they are pointed to a string literal in memory, they cannot modify the contents because most compilers mark that allocated memory as "Read-Only" in some lower bound address space for constants.

char * p = "const cstring";
*p = 'A'; // illegal memory write

I guess what I'm trying to understand is: why aren't char * pointers allowed to modify the string literals they point to, the way arrays can? Why don't string literals get cast into char * the way they do into char[]? If I have the wrong idea here or am completely off, feel free to correct me.

4
  • char * p = "const cstring"; should throw a compilation error, since "const cstring" is type const char* (specifically so that you don't use it like you're using it in your example) Commented Oct 6, 2011 at 3:44
  • @tylerl "const cstring" is not of type const char*. Commented Jan 20, 2019 at 13:35
  • @LightnessRacesinOrbit as of C++11 string literals are of type const char[N], which is perdy much equivalent to const char*. You can argue the pedantic details, but that's only adding confusion, not clarity. Commented Jan 27, 2019 at 0:35
  • @tylerl On the contrary, those are completely different types, and pretending otherwise is how confusion is introduced. Commented Jan 27, 2019 at 1:21

4 Answers

5

The bit that you're missing is a little compiler magic where this:

char cstr[] = "c-string"; 

behaves roughly like this:

char *cstr = alloca(strlen("c-string")+1);
memcpy(cstr,"c-string",strlen("c-string")+1);

You don't see that bit, but it's more or less what the code compiles to.
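
A rough, portable sketch of that equivalence, assuming a fixed-size local buffer in place of alloca (which is not standard C). The function name array_init_matches_manual_copy is made up for illustration; it only checks that initializing an array from a literal gives the same bytes as copying the literal by hand.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if array initialization from a literal
   produces the same bytes as an explicit copy of that literal. */
int array_init_matches_manual_copy(void)
{
    char cstr[] = "c-string";            /* compiler sizes and fills the array */

    char manual[sizeof "c-string"];      /* 8 characters + '\0' = 9 bytes */
    memcpy(manual, "c-string", sizeof "c-string");

    /* Both are independent, writable copies; the literal itself is untouched. */
    cstr[0] = 'C';
    manual[0] = 'C';

    return sizeof cstr == sizeof manual
        && memcmp(cstr, manual, sizeof cstr) == 0;
}
```

Calling array_init_matches_manual_copy() from main should return 1 on a conforming compiler.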


2 Comments

This is definitely what I have been missing! The answer I chose as the selected answer basically put this into words, but the code is even cleaner ;) so really cstr is a const char * to locally allocated memory on the stack (or possibly the heap) that is a copy and is modifiable, unlike the literal string's values. thanks so much for showing me this.
It's worth pointing out that in this case cstr is not a const char* but rather a char*. A const char* is what you get with string literals. That means that you can't go modifying its contents. In contrast, a char* means the data is modifiable. Also, it's very definitely on the stack, not the heap, which is why you don't have to call free() on it, and why you can't return it to your caller.
2

char cstr[] = "something"; is declaring an automatic array initialized to the bytes 's', 'o', 'm', ...

char * cstr = "something";, on the other hand, is declaring a character pointer initialized to the address of the literal "something".
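
A small sketch of the difference between the two declarations, under the assumption that the pointer is declared const so nothing writes through it. The helper name arrays_are_independent is hypothetical.

```c
/* Hypothetical helper: returns 1 if two arrays initialized from the same
   literal are separate objects, so writing to one leaves the other alone. */
int arrays_are_independent(void)
{
    char a[] = "something";        /* automatic array, own copy of the bytes */
    char b[] = "something";        /* a second, separate copy */
    const char *p = "something";   /* holds the literal's address; no copy made */

    a[0] = 'S';                    /* changes only a */

    return a[0] == 'S' && b[0] == 's' && p[0] == 's';
}
```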

1 Comment

Thanks for the insight on this. I see now that arrays, being a standard type, are literally initialized in their constructor by string literals, but are only copies of the literal itself.
1

In the first case you are creating an actual array of characters, whose size is determined by the size of the literal you are initializing it with (8+1 bytes). The cstr variable is allocated memory on the stack, and the contents of the string literal (which is located somewhere else, possibly in a read-only part of memory) are copied into this variable.

In the second case, the local variable p is allocated memory on the stack as well, but its contents will be the address of the string literal you are initializing it with.

Thus, since the string literal may be located in read-only memory, it is in general not safe to try to change it via the p pointer (you may get away with it, or you may not). On the other hand, you can do whatever you like with the cstr array, because that is your local copy that just happens to have been initialized from the literal.

(Just one note: the cstr variable is of type array of char, and in most contexts it converts to a pointer to the first element of that array. One exception is the sizeof operator, which computes the size of the whole array, not just of a pointer to the first element.)
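
The sizeof exception can be sketched as follows. The two helper names are hypothetical, and the pointer size depends on the platform, so no specific value is assumed for it.

```c
#include <stddef.h>

/* Hypothetical helper: sizeof applied to the array itself. */
size_t size_of_array(void)
{
    char cstr[] = "c-string";
    return sizeof cstr;          /* whole array: 8 characters + '\0' */
}

/* Hypothetical helper: sizeof applied after the array has converted
   to a pointer to its first element. */
size_t size_of_pointer(void)
{
    char cstr[] = "c-string";
    char *first = cstr;          /* array converts to a pointer to cstr[0] */
    return sizeof first;         /* size of a pointer, platform dependent */
}
```

size_of_array() yields 9 here, while size_of_pointer() yields whatever sizeof(char *) is on the target.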

2 Comments

Ah, I see. So the difference is that one is really just a pointer to the RO memory whilst the other is a variable of array of char (or constant pointer in other words) to a local copy of that literal in memory plus the null byte, and thus access to the copy is RW. Thanks for clarifying
It didn't always use to be RO memory, BTW. Modifying a "constant" string was an old C programmer's trick, before compilers started putting the strings in a read-only section.
1
char cstr[] = "c-string";

This copies "c-string" into a char array on the stack. It is legal to write to this memory.

char * p = "const cstring";
*p = 'A'; // illegal memory write

Literal strings like "c-string" and "const cstring" typically live in a read-only data section of your binary. Above, p points to memory in this area, and writing to that location is undefined behavior. Since C++11 this is enforced more strongly than before: the implicit conversion from a string literal to char* was removed, so you must declare it as const char* p instead.
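
If the goal is a modified version of the literal, one option is to keep the pointer const and write into a separate buffer. This is only a sketch; make_modified_copy and its parameters are hypothetical names, and the caller is assumed to supply the writable buffer.

```c
#include <string.h>

/* Hypothetical helper: writes a modified copy of a literal into out.
   Returns 0 on success, -1 if out is too small. */
int make_modified_copy(char *out, size_t out_size)
{
    const char *p = "const cstring";   /* read-only view of the literal */
    size_t needed = strlen(p) + 1;     /* characters plus the '\0' */

    if (out_size < needed)
        return -1;

    memcpy(out, p, needed);
    out[0] = 'C';                      /* legal: out is the caller's writable memory */
    return 0;
}
```

With a buffer such as char buf[32], the call make_modified_copy(buf, sizeof buf) leaves "Const cstring" in buf and never writes to the literal.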

Related question here.
