5

I'm a newbie in C. I'm trying to understand the concept of arrays in C, and I'm confused about array initialization.

Which is the better way to initialize an array of characters using a string literal?

char arr[3] = "xyz";

or

char arr[] = "xyz";

Thanks in advance.

6
  • 9
    Likely the 2nd way, because it prevents the bug that you've introduced in your 1st example. Commented Feb 5, 2017 at 5:59
  • Let the compiler do the work for you. The 2nd way. Commented Feb 5, 2017 at 6:43
  • 1
    It is interesting / strange that gcc gives a warning for "xyzz" but not "xyz" without arguments, but both are problematic. Your first example should be char arr[4] = "xyz"; Commented Feb 5, 2017 at 6:52
  • The first way is good if you really want, for whatever reason, an array of the specified length. For example, room for 30 characters but initialized (initially) with only 10. I like the "initially initialized"... Commented Feb 5, 2017 at 7:19
  • 1
    @asimes "xyz" version is legal code, the other is a constraint violation Commented Feb 5, 2017 at 9:10

4 Answers

8

Except in special circumstances, always prefer the second way, that is, leave out the explicit array size. This avoids the bug you introduced, apparently without noticing, in your example.

To understand this, you should first understand what exactly a string is. The null character is denoted by '\0'. A string is a series of zero or more non-null chars, terminated by a single null character. This last bit is very important. Look at the following code:

const char* my_string = "xyz";
size_t string_len = strlen( my_string ); // string_len == 3

A pointer is just a memory address. It doesn't hold any kind of size or length information in itself. Then how can strlen() measure my_string's length? By counting the non-null characters from the string's beginning up to, but not including, the terminating null character. You might have noticed by now that the terminating null character is implicit in a string literal. The above string literal creates an array in memory that looks like this:

 _______ _______ _______ _______
|       |       |       |       |
|  'x'  |  'y'  |  'z'  | '\0'  |
|_______|_______|_______|_______|
    ^
    |
`my_string` is a pointer to this cell
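
The counting that strlen() performs can be sketched roughly as follows. This is only an illustration: count_until_null is a hypothetical name, and real library implementations are usually far more optimized.

```c
#include <stddef.h>

/* Hypothetical helper, not the real library code: counts characters
   up to (but not including) the first '\0', which is the result
   strlen() is specified to return. */
size_t count_until_null(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0') {   /* stop at the terminating null character */
        n++;
    }
    return n;
}
```

Calling count_until_null("xyz") walks over 'x', 'y', 'z' and stops at the implicit '\0', so it returns 3.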

The array itself goes unnamed, but the compiler manages to give its first element's address as my_string's value. So, what happens with your first example?

char my_string[ 3 ] = "abc";

By the standard's definition, a string literal has type char[ N ], where N is the string's length plus one to account for the null character (note that string literals are not declared const for historical reasons, but it is still undefined behavior to modify them). Thus, the above expression "abc" has the type char[ 4 ]. my_string, on the other hand (which is now an array, not a pointer, BTW), has type char[ 3 ]. That is, you're initializing a smaller array from a larger one, since 4 > 3. The standard says that in this precise situation, where the array has room for every character except the terminating null character, the null character is simply left out. Thus, my_string looks like this in memory:

 _______ _______ _______
|       |       |       |
|  'a'  |  'b'  |  'c'  |
|_______|_______|_______|

Looks ok, but... wait. Where's the terminating null character? You chopped it off by explicitly declaring the array's size! Now, how is strlen() supposed to determine the string's length? It will just keep reading characters past the end of the array until it happens to find a null character. This is undefined behavior. On the other hand, by doing this:

const char my_string[] = "abc";

you don't run that risk. my_string's type will automatically be deduced as const char[ 4 ], and the null character will be preserved.
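
A minimal sketch of the difference, assuming a C (not C++) compiler. The function names are hypothetical and exist only for illustration. Each function checks whether its local array contains a '\0' anywhere within its own bounds.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: reports whether the n bytes at buf contain a '\0'.
   It never reads past buf[n - 1], so it is safe on unterminated arrays. */
int has_terminator(const char *buf, size_t n)
{
    return memchr(buf, '\0', n) != NULL;
}

int explicit_bound_is_terminated(void)
{
    char my_string[3] = "abc";        /* valid C, but no room for '\0' */
    return has_terminator(my_string, sizeof my_string);   /* returns 0 */
}

int deduced_bound_is_terminated(void)
{
    const char my_string[] = "abc";   /* bound deduced as 4 */
    return has_terminator(my_string, sizeof my_string);   /* returns 1 */
}
```

Some compilers may warn about the first declaration, which is a useful hint that the terminator was dropped.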

tl;dr Don't forget the terminating null character!

1 Comment

Thanks for taking the time to break this down with visuals! It made it easier to digest the information!
5

Whenever you initialize an array of characters with a string literal, do not specify the array's bound: the compiler will automatically allocate sufficient space for the entire string literal, including the terminating null character.

The C standard (C11, 6.7.9, paragraph 14) says:

An array of character type may be initialized by a character string literal or UTF−8 string literal, optionally enclosed in braces. Successive bytes of the string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.

char arr[3] = "xyz";

In this example, the size of arr is 3, but the size of the string literal is 4. The literal defines one more character (the terminating '\0') than the array can hold, so the terminator is not stored.

char arr[] = "xyz";

In this example, the initializer does not specify the bound of the character array. When the array bound is omitted, the compiler allocates enough space to store the entire string literal, including the null character.
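
For contrast, here is a sketch of the "if there is room" case from the quoted paragraph: an explicit bound larger than the literal. The function name and the size 30 are illustrative assumptions. As I understand the standard's rule for arrays initialized with fewer characters than elements, the remaining elements are zero-initialized, so the array is still a valid string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical check: returns 1 if every byte after the copied text
   in a 30-element buffer is '\0', and 0 otherwise. */
int tail_is_zeroed(void)
{
    char buf[30] = "xyz";   /* room for 30 chars, only 3 used so far */
    for (size_t i = strlen(buf); i < sizeof buf; i++) {
        if (buf[i] != '\0') {
            return 0;
        }
    }
    return 1;
}
```

An explicit bound is therefore reasonable when extra room is wanted on purpose; the problem only arises when the bound is too small to hold the terminator.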

1 Comment

This answer well describes what happens with the 2 approaches, but does not discuss the pros/cons of which is better under various cases.
2

Use the 2nd one: if you later initialize the array with a string longer than 3 characters, the size is adjusted automatically.

Sample Code

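#include <stdio.h>

int main(void)
{
    /* The bound is deduced from the literal, terminator included. */
    char arr[] = "xyz Hello World";
    /* sizeof(arr) - 1 skips the terminating null character. */
    for (size_t i = 0; i < sizeof(arr) - 1; i++) {
        printf("%c", arr[i]);
    }
    printf("\n");
    return 0;
}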

If you use the 1st one and try to store a string of more than 3 characters, the compiler shows a warning:

Warning

 warning: initializer-string for array of chars is too long [enabled by default]
     char arr[3] = "xyz Hello World";

So you should use the 2nd one; that is the better way to initialize an array of characters from a string.

1 Comment

@M.M now it will not print the null terminator.
0

Consider also using

const char* arr = "xyz";

It is used in much the same way (the 'const' keyword keeps you from accidentally changing the data), but arr is now a pointer to the string literal rather than an array. When the declaration is inside a function, the data won't be copied onto the stack; you use the static copy, which typically lives in the executable's data segment. Especially for large strings, this can be important.
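
A small sketch of how the two declarations differ, using hypothetical function names. The array form owns a modifiable copy whose size includes the terminator, while the pointer form only stores an address.

```c
#include <stddef.h>

/* Array form: sizeof reports the whole array, terminator included. */
size_t array_object_size(void)
{
    char arr[] = "xyz";
    return sizeof arr;          /* 4 */
}

/* Pointer form: sizeof reports the size of the pointer itself. */
size_t pointer_object_size(void)
{
    const char *arr = "xyz";
    return sizeof arr;          /* sizeof(const char *) */
}

/* The array is the function's own copy, so modifying it is fine.
   Modifying the literal through a pointer would be undefined behavior. */
char first_after_edit(void)
{
    char arr[] = "xyz";
    arr[0] = 'X';
    return arr[0];              /* 'X' */
}
```

One consequence is that sizeof cannot be used to get the string's length in the pointer form; strlen() is needed instead.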

1 Comment

A pointer and an array can be indexed with the same syntax, although they are different types. If you are not interested in modifying the data in the array, the suggestion above is better than creating an array on the stack and copying data into it.
