9

I read this question and its answer in a book. But I didn't understand the book's justification.

Will the following code compile?

int main()
{
   char str[5] = "fast enough";
   return 0;
}

And the answer was:

Yes. The compiler never detects the error if bounds of an array are exceeded.

I couldn't get it.

Can anybody please explain this?

10
  • 1
    This example is pretty clear, but you should probably specify the language, if not the book. Commented Nov 4, 2009 at 17:45
  • What part are you not understanding? Commented Nov 4, 2009 at 17:45
  • @mmyers This part: "The compiler never detects the error if bounds of an array are exceeded." Commented Nov 4, 2009 at 17:47
  • 2
    He doesn't understand WHY the compiler doesn't check to see if the "fast enough" string is too long to fit in an array of length 5. Commented Nov 4, 2009 at 17:47
  • Pro Tip: There's no need for the 'programming' tag here. It's redundant. If you're ever tempted to ask a question that couldn't be tagged 'programming', just don't. Commented Nov 4, 2009 at 17:49

6 Answers 6

6

In the C++ standard, 8.5.2/2 Character arrays says:

There shall not be more initializers than there are array elements.

In the C99 standard, 6.7.8/2 Initialization says:

No initializer shall attempt to provide a value for an object not contained within the entity being initialized

C90 6.5.7 Initializers says something similar.

However, note that for C (both C90 and C99) the '\0' terminating character will be put in the array if there is room. It's not an error if the terminator will not fit (C99 6.7.8/14: "Successive characters of the character string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array").

On the other hand, the C++ standard has an example that indicates an error should be diagnosed if there's not room for the terminating character.

In either case, this should be diagnosed as an error in all compilers:

char str[5] = "fast enough";

Maybe pre-ANSI compilers weren't so strict, but any reasonably modern compiler should diagnose this.


1 Comment

GCC 3.4.5 (compiling as C) only gives a warning (I think it should diagnose this as an error). GCC compiling as C++, MSVC 8, Digital Mars 8.5 and Comeau 4.3.10.1 all produce an error.
5

Your book must be pretty old, because gcc puts out a warning even without -Wall turned on:

$ gcc c.c
c.c: In function `main':
c.c:6: warning: initializer-string for array of chars is too long

If we slightly update the program:

#include <stdio.h>

int main(int argc, char **argv)
{

        char str[5] = "1234567890";
        printf("%s\n", str);
        return 0;
}

We can see that gcc seems to truncate the string to the length you've specified. I'm assuming that there happens to be a '\0' where str[5] would be, because otherwise we should see garbage after the 5. Maybe gcc implicitly makes str an array of length 6 and automatically sticks the '\0' in there; I'm not sure.

$ gcc c.c && ./a.exe
c.c: In function `main':
c.c:6: warning: initializer-string for array of chars is too long
12345

9 Comments

"Will it compile" is generally used to mean "will it compile without errors", not "will it compile without warnings".
"The compiler never detects the error if bounds of an array are exceeded." In this case it does.
It simply does not require a compiler error according to the language specification. The compiler can warn about it, and even treat it as an error if you ask it to treat warnings as errors (-Werror). In case of array initializations it should be pretty trivial for the compiler to check the number of initializers (and it has to do it anyway, because it has to fill the rest of the array with zeros, should there be too few initializers). Basically C is a thin abstraction over assembly, and it is supposed to allow you to make all the "mistakes" you could do at a lower level...
Yes it does; this shows that some compilers do detect that the constant string was too long, and they truncate the string.
@UncleBens: Wrong. It does require a compile error according to the language specification. The language explicitly prohibits supplying more initializers than there are objects to be initialized. The language only allows dropping the terminating 0 from the string literal. Dropping anything else is explicitly prohibited. The code in the OP must generate a diagnostic message, because it is a constraint violation.
2

The answer to the question that you quoted is incorrect. The correct answer is "No. The code will not compile", assuming a formally correct C compiler (as opposed to quirks of some specific compiler).

The C language does not allow using an excessively long string literal to initialize a character array of a specific size. The only flexibility allowed by the language here is the terminating \0 character. If the array is too short to accommodate the terminating \0, the terminating \0 is silently dropped. But the actual literal string characters cannot be dropped. If the literal is too long, it is a constraint violation and the compiler must issue a diagnostic message.

char s1[5] = "abc"; /* OK */
char s2[5] = "abcd"; /* OK */
char s3[5] = "abcde"; /* OK, zero at the end is dropped (ERROR in C++) */
char s4[5] = "abcdef"; /* ERROR, initializer is too long (ERROR in C++ as well) */

Whoever wrote your "book" did not know what they were talking about (at least on this specific subject). What they state in the answer is flat out incorrect.

Note: Supplying excessively long string initializers is illegal in C89/90, C99 and C++. However C++ is even more restrictive in this regard. C++ prohibits dropping the terminating \0 character, while C allows dropping it, as described above.

6 Comments

Where is that constraint mentioned? I don't see it in the section on initializers.
In C89/90 it is in 6.5.7: There shall be no more initializers in an initializer list than there are objects to be initialized. The exception for terminating 0 is given further in the text. I'm still looking for an equivalent in C99...
In C99 it is in the very same place: 6.7.8/2. No initializer shall attempt to provide a value for an object not contained within the entity being initialized.
No, it is a constraint violation in C, both C89/90 and C99. I quoted the relevant portions of both standards. The difference with C++ exists, but it's only with regard to the terminating 0. C++ does not allow dropping the 0.
Note that a diagnostic does not mean that it "will not compile" as you say - the compiler is free to give a warning instead (but the behavior of the resulting program is undefined)
0

Array-bound checking happens at runtime, not compile time. The compiler has no way of doing the static analysis of the above code that would be necessary to prevent the error.

UPDATE: Apparently the above statement is true for some compilers and not others. If your book says it will compile, it must be referring to a compiler that doesn't do the checking.

3 Comments

Sure it can - this is an initializer known to the compiler at compile time. It's just not required to error out.
Well, in C, array bounds checking doesn't happen at all. And obviously the compiler can do the static analysis, since gcc would give you a warning (as pointed out by another poster)
C doesn't do array bounds checking, hence this compiles. The compiler could do static checking but it's still valid C.
0

Because "fast enough" is simply a pointer to a null-terminated string. It's too much work for the compiler to figure out if every assignment to a char* or char [] is going to go beyond the bounds of the array.

2 Comments

This can't be the reason. The compiler has to check the number of initializers anyway to allow things like char s[] = "Hello" and int array[10] = {2}
There is a difference between "Hello" and {'H','e','l','l','o','\0'}. One is an array and one is simply a pointer. The pointer could point anywhere, but the array is constant at compile time.
0

What's happening is you're trying to initialize a character array with more characters than the array has room for. Here's how it breaks down:

char str[5];

Declares a character array with five characters.

char str[5] = "fast enough";

The second part '= "fast enough";' then attempts to initialize that array with the value "fast enough". This will not work, because "fast enough" is longer than the array is.

It will, however, compile. C and C++ compilers can't generally perform bounds checking on arrays for you, and overrunning an array is one of the most common reasons for segmentation faults. [edit]As Mark Rushakoff pointed out, apparently the newer ones do throw warnings, for some cases.[/edit] This may segfault when you try to run it; more likely, I think, the array will simply be initialized to "fast ".

4 Comments

No, it won't "compile". It is a constraint violation, which requires a diagnostic message. Comeau, a very pedantic compiler, will refuse to compile this with an error message. GCC, on the other hand, opts for a mere warning. From the language point of view this is a constraint violation, i.e. in simple terms it is an error.
Harsh, Andrey. Obviously it will compile - SOME OF THE TIME. It does in GCC, it does in the original poster's compiler. It will in MANY compilers. And if he doesn't have -Wall turned on it won't even give a warning. Maybe it is a violation and maybe some compilers will catch it. But my answer is not wrong in saying that it will generally compile. And my assertion that in most cases compilers can't and don't perform bounds checking is also true. Initialization is one of the few cases where they can, and as mentioned in the edit, some do.
@Alcon: No, actually, it seems to fail with an error in most (if not virtually all) compilers. In fact, GCC is the only exception to that rule that I know of so far. And no, you don't need -Wall in GCC to activate this warning. This warning is issued by default. Additionally, this has nothing to do with bounds checking. This is just a matter of supplying too many initializers in aggregate initialization. I'm sure all compilers without a single exception will issue an error if you do int a[1] = { 1, 2 }. The string literal is not really different from that.
Actually, even int a[1] = { 1, 2 } is just a warning in GCC. Even if you do that with a struct it is just a warning in GCC. There must be a reason for this, most likely a historical one (they might want to support some crappy legacy code). But the fact that this is still a warning even in -pedantic-errors mode really means that this is a bug in GCC. GCC needs to fix it; the current behavior is hardly acceptable, especially in -pedantic-errors mode.
