4

I know that the correct way to compare "strings" in C is by using strcmp, but now I tried comparing some character arrays with the == operator, and got some strange results.

Take a look at the following code:

#include <stdio.h>

int main(void)
{
    char *s1 = "Andreas";
    char *s2 = "Andreas";

    char s3[] = "Andreas";
    char s4[] = "Andreas";

    char *s5 = "Hello";

    printf("%d\n", s1 == s2); //1
    printf("%d\n", s3 == s4); //0
    printf("%d\n", s1 == s5); //0
}

The first printf correctly prints a 1, which signals that they are not equal. But can someone explain to me why, when comparing the character arrays, the == is returning a 0?

Can someone please explain to me why the first printf is returning a 1 (i.e., they are equal) and the character arrays are returning a 0?

3
  • You should explicitly cast the result of the comparison to an int. Since you are using %d, printf will be looking for the size of an int (often 4 bytes) on the stack, but the bool is usually 1 byte. Commented Nov 26, 2009 at 0:17
  • Bipedal: C's promotion rules for vararg functions handle this just fine, and what makes you think == results in a bool instead of an int anyway? :P Commented Nov 26, 2009 at 0:25
  • According to rules for vararg conversions, bool will be implicitly promoted to int when used as a vararg argument - see 5.2.2[expr.call]/7 and 4.5[conv.prom]/4. Commented Nov 26, 2009 at 0:28

7 Answers

20

The == is comparing the memory address.
It's likely that your compiler is making s1 and s2 point to the same static data to save space.

That is, the "Andreas" in the first two lines of code is stored in your executable's static data. String literals must not be modified, so the compiler is free to optimize the two pointers to point to the same storage.

The char[] lines each create a separate array variable and copy the data into it, so the two arrays are stored at different addresses on the stack during execution.


7 Comments

so this is all due to compiler optimization?
I'm pretty sure it's a requirement of the standard that identical string literals in the same translation unit ("file", in standards-speak) are required to have the same storage location.
Depending on architecture, those first 2 string constants could be in write protected memory and thus immutable. It would make sense to cut down on space requirements by creating the constant just once.
"optimization" would stretch the word a bit, but yeah.
Andy: Not required, 6.4.5/6 (C99): "It is unspecified whether these arrays are distinct ..."
4

Uh... when == prints a 1, it means they are equal. It's different from strcmp, which returns the relative order of the strings.


2

You are comparing addresses and not the strings. The first two point to the same string literal, which the compiler will typically create only once.

#include <stdio.h>

int main(void)
{
    char *s1 = "Andreas";
    char *s2 = "Andreas";

    char s3[] = "Andreas";
    char s4[] = "Andreas";

    char *s5 = "Hello";

    /* %p expects a void *, so the pointers are cast explicitly. */
    printf("%d\n", s1 == s2); //1
    printf("%p == %p\n", (void *)s1, (void *)s2);
    printf("%d\n", s3 == s4); //0
    printf("%p != %p\n", (void *)s3, (void *)s4);
    printf("%d\n", s1 == s5); //0
    printf("%p != %p\n", (void *)s1, (void *)s5);
}

Output on my computer, but you get the idea:

1
0x1fd8 == 0x1fd8
0
0xbffff8fc != 0xbffff8f4
0
0x1fd8 != 0x1fe0


1

Wait a sec... 1 means true, 0 means false. So your explanation is partially backwards. As for why the first two strings seem to be equal: The compiler built that constant string (s1/2) just once.

1 Comment

Oh yeah, you're actually right; I still had strcmp in my head, and I inverted the values!
1

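s1 == s2 means "(char*) == (char*)", that is, it tests whether the two addresses are the same.

The same goes for s3 == s4. That's "arrays decay into pointers" at work: each array is converted to a pointer to its first element before the comparison.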

And you have the meaning of the result of the comparison wrong:

0 == 0; /* true; 1 */
42 == 0; /* false; 0 */
"foo" == "bar"; /* false (the addresses are different); 0 */


1

None of s1 through s5 is a string value in itself: s1, s2, and s5 are pointers to char, and the arrays s3 and s4 decay to pointers to char when used in a comparison. So what you're comparing is the memory address of each string, rather than the strings themselves.

If you display the addresses thus, you can see what the comparison operators are actually working on:

#include <stdio.h>

int main(void) {
  char *s1 = "Andreas";
  char *s2 = "Andreas";

  char s3[] = "Andreas";
  char s4[] = "Andreas";

  char *s5 = "Hello";

  /* %p expects a void *, so each pointer is cast explicitly. */
  printf("%p\n", (void *)s1); // 0x80484d0
  printf("%p\n", (void *)s2); // 0x80484d0
  printf("%p\n", (void *)s3); // 0xbfef9280
  printf("%p\n", (void *)s4); // 0xbfef9278
  printf("%p\n", (void *)s5); // 0x80484d8
}

Exactly where the strings are allocated in memory is implementation-specific. In this case, s1 and s2 point to the same static memory block, but I wouldn't expect that behaviour to be portable.

1 Comment

Yeah that's a string pool optimization. I wouldn't count on it.
0

You can't compare strings, but you can compare pointers.

1 Comment

Yes, I know. :-) I meant with the comparison operators. :-)
