1

I was going through this Q&A, where it is said that a char array initialized with a string literal causes two memory allocations: one for the variable and one for the string literal.

I wrote the program below to see how the memory is allocated.

#include <stdio.h>
#include <string.h>

int main()
{
    char a[] = "123454321";
    
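    /* %p expects a void *, so both pointers are cast explicitly. */
    printf("a =%p and &a = %p\n", (void *)a, (void *)&a);

    /* strlen returns size_t, so the index uses the same type. */
    for (size_t i = 0; i < strlen(a); i++)
        printf("&a[%zu] =%p and a[%zu] = %c\n", i, (void *)&a[i], i, a[i]);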
    
    return 0;
}

and the output is:

a =0x7ffdae87858e and &a = 0x7ffdae87858e
&a[0] =0x7ffdae87858e and a[0] = 1
&a[1] =0x7ffdae87858f and a[1] = 2
&a[2] =0x7ffdae878590 and a[2] = 3
&a[3] =0x7ffdae878591 and a[3] = 4
&a[4] =0x7ffdae878592 and a[4] = 5
&a[5] =0x7ffdae878593 and a[5] = 4
&a[6] =0x7ffdae878594 and a[6] = 3
&a[7] =0x7ffdae878595 and a[7] = 2
&a[8] =0x7ffdae878596 and a[8] = 1

From the output, it does not look like there are two separate memory locations for the array and the string literal.

If the array and the string literal do have separate memory, is there any way to prove that array a and the string literal are stored separately in this scenario?

link to clone: https://onlinegdb.com/HkJhdSHyd

5
  • What do you mean by "2 separate storages" exactly? All we have is one array whose contents are the string "123454321". Commented Jan 20, 2021 at 6:09
  • @DavidSchwartz, please refer to the answer in the link provided. I was confused by that answer, so I asked a separate question. Commented Jan 20, 2021 at 6:12
  • Define "allocations". There is one variable a[] initialized with a constant string. Commented Jan 20, 2021 at 6:13
  • All of the answers below are correct in their own way, so shall I just conclude that it's up to the compiler how it stores the string literal (it can make a copy, emit its own initialization code, or use the same memory as the array)? Commented Jan 23, 2021 at 10:46
  • @IrAM First two would be the common implementations, the last one "or it may use same memory as that of array" is very unlikely. The array char a[] = "..."; is an automatic variable, and it would usually be allocated on the stack, which is eminently dynamic and can not be statically initialized. In this case however the compiler could maybe determine that main is called only once, and the array is accessed only once, so it could technically generate code equivalent to static char a[] = "..."; but, again, that would be least likely, and not possible at all except in trivial cases. Commented Jan 23, 2021 at 18:38

5 Answers

4

char a[] = "123454321";

Technically, the string literal "123454321" is not required to be stored anywhere as such. All that's required is that a[] be initialized with the right values when main is entered. Whether that's done by copying the string from some static read-only memory location, or running code that fills it in some other way is not mandated by the standard.

As far as the standard goes, it would be perfectly acceptable for the compiler to emit code equivalent to the following in order to initialize a[]:

char a[10];
for(int n = 0; n <= 4; n++)
    a[n] = a[8-n] = '1' + n;
a[9] = '\0';
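As a quick sanity check (a sketch, not something the answer itself provides), the loop above can be compared byte for byte against the literal initializer. The function name loop_init_matches_literal is made up for illustration.

```c
#include <string.h>

/* Hypothetical helper: returns 1 when the loop-based initialization
   produces exactly the same 10 bytes as the string-literal initializer. */
int loop_init_matches_literal(void)
{
    char from_literal[] = "123454321";
    char from_loop[10];

    /* Fill the array symmetrically from both ends: '1'..'5'..'1'. */
    for (int n = 0; n <= 4; n++)
        from_loop[n] = from_loop[8 - n] = (char)('1' + n);
    from_loop[9] = '\0';

    return sizeof from_literal == sizeof from_loop
        && memcmp(from_literal, from_loop, sizeof from_loop) == 0;
}
```

Since both arrays end up identical, a program cannot tell which of the two strategies the compiler picked just by looking at a.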

In fact, at least one compiler (gcc) initializes a[] via custom code, rather than storing and copying the literal string.

mov     DWORD PTR [ebp-22], 875770417    ; =  0x34333231  =  '1', '2', '3', '4'
mov     DWORD PTR [ebp-18], 842216501    ; =  0x32333435  =  '5', '4', '3', '2'
mov     WORD  PTR [ebp-14], 49           ; =  0x31        =  '1', '\0'
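The three stores can be mimicked in portable C to confirm that the immediates really spell out the string. This is only a sketch: it assumes an ASCII character set and emulates x86's little-endian stores by hand, and the names store_le and immediates_rebuild_string are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: writes the low nbytes of value to dst,
   least significant byte first, as a little-endian mov would. */
static void store_le(unsigned char *dst, uint32_t value, size_t nbytes)
{
    for (size_t i = 0; i < nbytes; i++)
        dst[i] = (unsigned char)(value >> (8 * i));
}

/* Returns 1 if the three immediates from the listing rebuild "123454321"
   including its terminating '\0' (assumes ASCII). */
int immediates_rebuild_string(void)
{
    unsigned char a[10];

    store_le(a + 0, 875770417u, 4);  /* 0x34333231 -> '1','2','3','4' */
    store_le(a + 4, 842216501u, 4);  /* 0x32333435 -> '5','4','3','2' */
    store_le(a + 8, 49u, 2);         /* 0x0031     -> '1','\0'        */

    return memcmp(a, "123454321", sizeof a) == 0;
}
```

In this form the characters live only inside the instruction stream as immediate operands, so there is no separate 10-byte copy of the literal in a data section.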

4 Comments

So we cannot really say there is separate memory for the variable and the literal: the variable has enough memory to hold the literal, and whether the literal is also stored somewhere else is not specified. Is my understanding correct?
The statement char a[] = "123454321"; defines and initializes one variable. How the array gets initialized is not specified or mandated by the standard. Compare this to const char *p = "123454321"; which defines and initializes a pointer variable, and at the same time guarantees that a 10-element char array with static storage duration exists somewhere and holds the literal "123454321" string.
@IrAM See my answer. You can definitely say there has to be separate memory for the variable and the literal because they can have different contents. The literal has to be stored somewhere else, though due to C's "as-if" rule, that could be in code.
@dxiv, I agree with you, so is my comment from today under my question valid?
1

You can prove it by modifying the code as follows:

#include <stdio.h>

int main(void)
{
    for (int i = 0; i < 2; ++i)
    {
        char a[] = "123454321";

        printf("a = %s\n", a);
        a[3] = 'x';
        a[5] = 'y';
        printf("a = %s\n", a);
    }
}

Output:

a = 123454321
a = 123x5y321
a = 123454321
a = 123x5y321

We got the original string back after modifying it, so the original string must have been stored somewhere other than the place we modified.
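For contrast, here is a minimal sketch of the same experiment with a static array, the variant mentioned in a comment under the question. The function names automatic_demo and static_demo are invented for this illustration; each one copies the array's contents into the caller's 10-byte buffer before modifying the array.

```c
#include <string.h>

/* Automatic array: initialized again every time the declaration is reached. */
void automatic_demo(char before[10])
{
    char a[] = "123454321";
    memcpy(before, a, sizeof a);  /* snapshot of a before the modification */
    a[3] = 'x';                   /* lost when the function returns */
}

/* Static array: initialized once, so the modification is still
   there on the next call. */
void static_demo(char before[10])
{
    static char a[] = "123454321";
    memcpy(before, a, sizeof a);  /* snapshot of a before the modification */
    a[3] = 'x';                   /* persists across calls */
}
```

Calling automatic_demo twice yields "123454321" both times, while the second call to static_demo yields "123x54321". Only the automatic array needs its initial contents to be reproducible at run time, whether from a stored copy or from generated code.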

8 Comments

Yes. If I use printf("a = %s and &a = %p\n", a, &a);, I see that the address is always the same even though the string changes. Is that undefined behavior or correct behavior?
@IrAM The implementation can do it either way. Each iteration of the loop creates a new instance of a. The old instance's lifetime is over, so the implementation is free to re-use its storage but it is not required to.
This answer is not correct. The compiler does not need to store "123454321". It can just allocate memory initialized with '1','2','3','4'..... for a
@ServeLaurijssen How can it allocate memory initialized with particular contents if it does not store that contents somewhere?
See dxiv's answer. It can be custom code: the compiler could generate a series of mov instructions. There's no guarantee that an array is stored somewhere and its contents copied into the other array.
1

You've completely misunderstood the question and answer. The question was about whether the initializer string consumes memory in addition to the actual array. Now the thing is, you cannot observe the initializer string.

It is like there are two sheets of paper. One in the closet with 123454321 written with ballpoint pen. One on the desk - initially empty. Then someone else comes, takes the sheet from the closet, reads the text on it, and writes it on the sheet on the desk using a pencil. Then puts the paper back into closet.

Now you're looking at that sheet on desk saying: "clearly the text 123454321 has not been written twice onto this sheet, hence what do they say about there being two copies?"

2 Comments

Your statement: The question was about whether the initializer string consumes memory in addition to the actual array. My statement: a char array initialized with a string literal will cause two memory allocations, one for the variable and the other for the string literal.
I am sorry, I don't see any difference. Or is my English bad?
1

Yes, your string is stored in two places: one in the pre-initialized data section of the executable, and then, as your program runs, it is copied into a second place in the program's working memory.

Note: on a Linux system you can run the strings program on the binary to ferret out any strings tucked away in it.

% cc myprogram.c -o myprogram
% strings myprogram

The memory layout of your executable is well defined by the compiler build system, and is indicated to the operating system by the use of a code number at the beginning (called a Magic Number, by the way).

A typical layout has:

  • Text segment (i.e. instructions)
  • Initialized data segment
  • Uninitialized data segment (bss)
  • Heap
  • Stack

Your source code (C, C++, Fortran...) is compiled to machine code for the machine you intend to run it on and stored in the text segment. Data whose value is known at compile time (such as the "123454321") is assigned an address and is typically stored in the initialized data segment by the compiler.

Data that only gets a value during execution is assigned an address in the uninitialized data segment, which is set to 0 when the operating system loads your program. The name bss comes from "Block Started by Symbol", an old assembler directive; the segment's start address and size tell the loader which region to fill with zeros.

Next, the heap is where dynamic memory allocation (malloc) gets its memory, and the stack is where the state of a function is saved when it calls a subroutine to do work for it.
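A rough sketch of how ordinary C declarations usually map onto these segments follows. Where each object actually ends up depends on the compiler, linker and platform, so the comments describe the common case only, and all names are made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

int initialized_global = 42;            /* usually the initialized data segment */
int uninitialized_global;               /* usually bss; guaranteed to start as 0 */
const char *literal_ptr = "123454321";  /* pointer in data; the characters often
                                           sit in a read-only data section */

/* Returns 1 if a stack copy and a heap copy of the literal hold the same
   text while being distinct objects. Returns 0 if malloc fails. */
int copies_match(void)
{
    char on_stack[] = "123454321";            /* automatic storage, usually the stack */
    char *on_heap = malloc(sizeof on_stack);  /* dynamic storage, the heap */
    if (on_heap == NULL)
        return 0;

    memcpy(on_heap, literal_ptr, sizeof on_stack);
    int same = strcmp(on_stack, on_heap) == 0 && on_heap != on_stack;

    free(on_heap);
    return same;
}
```

The C standard itself only speaks of static, automatic and allocated storage duration; the segment names are an implementation detail of typical toolchains.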

See pages like https://www.geeksforgeeks.org/memory-layout-of-c-program/ (if it still exists!) for deeper explanation.


0

You can't prove there are two storages because you have only one.

The compiler sees you want a char array initialized with some characters and '\0' so it does that. It does not need to store the string literal somewhere else.

This would not compile for that reason.

#include <stdio.h>
#include <string.h>

char *p = "123454321";

int main()
{
    char a[] = p;   /* does not compile: an array cannot be initialized from a pointer */
    
    printf("a =%p and &a = %p\n", a, &a);

    for(int i = 0; i< strlen(a); i++)
        printf("&a[%d] =%p and a[%d] = %c\n",i,&a[i],i,a[i]);
    
    return 0;
}

2 Comments

Then below answer conflicts with this answer
This cannot be correct. The string literal must be stored somewhere in some form or it would be impossible to initialize the array to it. And it can't be stored in only one place because if it was, modifying the array would cause the code to fail if you executed it again.
