
I understand that inside a method:

String myStr1 = "good";
String myStr2 = "good";
System.out.println(myStr1==myStr2);

Prints true. For the same reason:

String myStr1 = "good";
String myStr2 = ""+'g'+'o'+'o'+'d';
System.out.println(myStr1==myStr2);

Also prints true.

Then why:

String myStr1 = "good";
char[] myCharArr = {'g', 'o', 'o', 'd' };
String myStr2 = ""+myCharArr[0]+myCharArr[1]+myCharArr[2]+myCharArr[3];
System.out.println(myStr1==myStr2);

Prints false? I don't see the difference between the last two snippets. Any idea? Thanks.

5
  • The first two examples are concatenated at compile time, and resolve to the same object. The last example is concatenated at run time, and usually creates a new unique String object. Commented Nov 29, 2015 at 20:36
  • I see the point. Tks. Commented Nov 29, 2015 at 20:40
  • stackoverflow.com/questions/513832/… is a nice page to bookmark for this :) Commented Nov 29, 2015 at 20:45
  • I'm guessing you're aware that Strings are objects and value comparison of them requires myStr1.equals(myStr2). == does reference comparison on objects and I'm guessing your question is about how Java decides to use the same objects or not at compile time. Commented Nov 29, 2015 at 20:49
  • Hi Alain. Your first guess is right; I'm well aware of it. As for the second guess, I had not considered that sharing one String object between different references has to happen at compile time, so the third snippet always creates a new object because the String is built at run time. The doubt arose from a question I met while preparing for the OCA test. Commented Nov 30, 2015 at 12:34

4 Answers

3

The compiler replaces multiple value-equal Strings built from Constant Expressions like this:

String myStr1 = "good";
String myStr2 = ""+'g'+'o'+'o'+'d';
System.out.println(myStr1==myStr2);

With a unique String object obtained from String.intern. That unique String object is then assigned to both variables. This is why they are then reference equal.

String myStr1 = "good";
char[] myCharArr = {'g', 'o', 'o', 'd' };
String myStr2 = ""+myCharArr[0]+myCharArr[1]+myCharArr[2]+myCharArr[3];
System.out.println(myStr1==myStr2);

The compiler cannot optimize this because it has an array reference which is not a Constant Expression. This results in two separate String objects which are not reference equal. It would violate the Java Language Specification to do otherwise.
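The two cases above can be put side by side in a small sketch. The class and method names here are made up for illustration; the string expressions are the ones from the question.

```java
class ConstantFoldingDemo {
    // Every operand is a literal, so the right-hand side is a constant
    // expression and the compiler folds it to the literal "good".
    static boolean foldedAtCompileTime() {
        String myStr1 = "good";
        String myStr2 = "" + 'g' + 'o' + 'o' + 'd';
        return myStr1 == myStr2;
    }

    // Array element access is not a constant expression, so the
    // concatenation runs at run time and yields a new String object.
    static boolean builtAtRunTime() {
        String myStr1 = "good";
        char[] myCharArr = {'g', 'o', 'o', 'd'};
        String myStr2 = "" + myCharArr[0] + myCharArr[1] + myCharArr[2] + myCharArr[3];
        return myStr1 == myStr2;
    }

    // The contents are still equal; only the references differ.
    static boolean builtAtRunTimeEquals() {
        String myStr1 = "good";
        char[] myCharArr = {'g', 'o', 'o', 'd'};
        String myStr2 = "" + myCharArr[0] + myCharArr[1] + myCharArr[2] + myCharArr[3];
        return myStr1.equals(myStr2);
    }

    public static void main(String[] args) {
        System.out.println(foldedAtCompileTime());   // true
        System.out.println(builtAtRunTime());        // false
        System.out.println(builtAtRunTimeEquals());  // true
    }
}
```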

Here is the definition of a Constant Expression from the Java Language Specification:

A constant expression is an expression denoting a value of primitive type or a String that does not complete abruptly and is composed using only the following:

  • Literals of primitive type and literals of type String (§3.10.1, §3.10.2, §3.10.3, §3.10.4, §3.10.5)

  • Casts to primitive types and casts to type String (§15.16)

  • The unary operators +, -, ~, and ! (but not ++ or --) (§15.15.3, §15.15.4, §15.15.5, §15.15.6)

  • The multiplicative operators *, /, and % (§15.17)

  • The additive operators + and - (§15.18)

  • The shift operators <<, >>, and >>> (§15.19)

  • The relational operators <, <=, >, and >= (but not instanceof) (§15.20)

  • The equality operators == and != (§15.21)

  • The bitwise and logical operators &, ^, and | (§15.22)

  • The conditional-and operator && and the conditional-or operator || (§15.23, §15.24)

  • The ternary conditional operator ? : (§15.25)

  • Parenthesized expressions (§15.8.5) whose contained expression is a constant expression.

  • Simple names (§6.5.6.1) that refer to constant variables (§4.12.4).

  • Qualified names (§6.5.6.2) of the form TypeName . Identifier that refer to constant variables (§4.12.4).

Constant expressions of type String are always "interned" so as to share unique instances, using the method String.intern.

A constant expression is always treated as FP-strict (§15.4), even if it occurs in a context where a non-constant expression would not be considered to be FP-strict.

SOURCE: http://docs.oracle.com/javase/specs/jls/se8/html/jls-15.html#d5e30892
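As an illustration of the "simple names that refer to constant variables" item in the quoted definition, here is a minimal sketch. The class, field, and variable names are hypothetical; the point is that a final variable initialized with a constant expression takes part in compile-time folding, while a non-final variable or an element of a final array does not.

```java
class ConstantVariableDemo {
    // A constant variable: final and initialized with a constant expression.
    static final String PREFIX = "go";

    static boolean finalLocal() {
        final String tail = "od";       // also a constant variable
        String joined = PREFIX + tail;  // constant expression, folded to "good"
        return joined == "good";
    }

    static boolean nonFinalLocal() {
        String tail = "od";             // not final, so not a constant variable
        String joined = PREFIX + tail;  // concatenated at run time
        return joined == "good";
    }

    static boolean finalArray() {
        // The reference is final, but chars[0] is still an array access,
        // which the definition above does not list.
        final char[] chars = {'g', 'o', 'o', 'd'};
        String joined = "" + chars[0] + chars[1] + chars[2] + chars[3];
        return joined == "good";
    }

    public static void main(String[] args) {
        System.out.println(finalLocal());     // true
        System.out.println(nonFinalLocal());  // false
        System.out.println(finalArray());     // false
    }
}
```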


2 Comments

Note also that even if myCharArr were declared final, an element access such as myCharArr[0] would still not be a constant expression.
@KlitosKyriacou yes, very important. Constant Expression as I've used it refers to a specific and constrained definition in the language specification, not to all semantically constant expressions :)
3

myCharArr[0] can't be evaluated at compile time. The compiler (whose cleverness is limited) has to assume that the array may be modified at run time before the string is concatenated (maybe by some other thread), so it doesn't assume that, for instance, myCharArr[0] will be 'g' (maybe this behavior will be improved in the future).

With code like ""+'g'+'o'+'o'+'d' the compiler is sure about the values it handles: because only compile-time constants are used, it can work out that the resulting string will be "good". To avoid recalculating the expression every time the code runs, it simply replaces ""+'g'+'o'+'o'+'d' with "good".
Since it can't evaluate the expression that uses myCharArr[0], it can't optimize that code the same way, which means the creation of that string has to be left to code executed at run time.


Now, if you are wondering why == returns true for "good"=="good" but false for code like "good"==new String("good"), you need to know that:

  • == compares references; in other words, it tests whether two references point to the same object (to check whether two objects are equal, use the equals method)
  • Java has a String pool that stores literals to avoid creating many String objects holding the same data, and the compiler adds the code responsible for placing literals in that pool and retrieving them from it, so in "good"=="good" both literals are the same object from the pool, which the result true confirms
  • the compiler does not add such code for a String created explicitly at run time with the new String(data) constructor, which keeps the pool free of strings that will most probably never be created again, so in "good"==new String("good") you are comparing two different objects: "good" from the pool, and the new String(...), which is separate from the pooled one (which the result false confirms).
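The three points above can be checked with a short sketch; the class and method names are invented for this example.

```java
class StringPoolDemo {
    // Two literals with the same contents refer to one pooled object.
    static boolean literalVsLiteral() {
        String a = "good";
        String b = "good";
        return a == b;
    }

    // new String(...) always creates a separate object outside the pool.
    static boolean literalVsNewString() {
        String a = "good";
        String b = new String("good");
        return a == b;
    }

    // equals compares contents, so it does not care which object is used.
    static boolean literalEqualsNewString() {
        String a = "good";
        String b = new String("good");
        return a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(literalVsLiteral());        // true
        System.out.println(literalVsNewString());      // false
        System.out.println(literalEqualsNewString());  // true
    }
}
```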

3 Comments

This is a little hand-wavy without a reference to constant expressions and the language specification. It isn't based on the possibility of arrays changing. Technically speaking static analysis could rule that out, so it really is the JLS definition of Constant Expressions that matters here.
"static analysis could rule that out, so it really is the JLS definition of Constant Expressions that matters here" that is very true. Java compiler already handles nicely concatenation of compilation-constants stored in final variables so we may hope for other compiler improvements in the future. Anyway updated my answer a little. I wasn't trying to create answer based on "because JLS says so". I tried to focus on possible problem/reason which could show why JLS says so.
that makes sense. This is one area where a future language spec could expand the definition of Constant Expression. That will expose some fun bugs when it happens ;)
1

Only compile-time constant Strings are automatically interned. What counts as a constant string is described (in general, for constant expressions) in the Oracle documentation, namely the Java Language Specification's section on constant expressions. By that definition, your char array is not constant, and therefore the expression that uses it will create a new String object.
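If a run-time String does need to be reference-equal to the literal, it can be interned by hand. A minimal sketch, with a hypothetical helper buildAtRunTime standing in for the concatenation from the question:

```java
class InternDemo {
    // Hypothetical helper: builds "good" from array elements at run time.
    static String buildAtRunTime() {
        char[] myCharArr = {'g', 'o', 'o', 'd'};
        return "" + myCharArr[0] + myCharArr[1] + myCharArr[2] + myCharArr[3];
    }

    // Not interned automatically, so this is a different object.
    static boolean beforeIntern() {
        return buildAtRunTime() == "good";
    }

    // String.intern() returns the pooled instance, the same one the literal uses.
    static boolean afterIntern() {
        return buildAtRunTime().intern() == "good";
    }

    public static void main(String[] args) {
        System.out.println(beforeIntern());  // false
        System.out.println(afterIntern());   // true
    }
}
```

In ordinary code, equals remains the right way to compare string contents; the intern call is shown only to make the pooling visible.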


1

The following statement

String myStr2 = ""+myCharArr[0]+myCharArr[1]+myCharArr[2]+myCharArr[3];

is compiled to roughly the following:

  1. StringBuilder sb = new StringBuilder()
  2. sb.append("").append(myCharArr[0]) ... sb.append(myCharArr[3])
  3. sb.toString(), which returns a new String

Disassemble the bytecode (for example with javap -c) and you will see something like this:

  28: invokespecial #3                  // Method java/lang/StringBuilder."<init>":()V
  31: ldc           #4                  // String
  33: invokevirtual #5                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
  36: aload_1
  37: iconst_0
  38: caload
  39: invokevirtual #6                  // Method java/lang/StringBuilder.append:(C)Ljava/lang/StringBuilder;
  42: aload_1
  43: iconst_1
  44: caload
  45: invokevirtual #6                  // Method java/lang/StringBuilder.append:(C)Ljava/lang/StringBuilder;
  48: aload_1
  49: iconst_2
  50: caload
  51: invokevirtual #6                  // Method java/lang/StringBuilder.append:(C)Ljava/lang/StringBuilder;
  54: aload_1
  55: iconst_3
  56: caload
  57: invokevirtual #6                  // Method java/lang/StringBuilder.append:(C)Ljava/lang/StringBuilder;
  60: invokevirtual #7                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;

Whereas the other statements

String myStr1 = "good";
String myStr2 = ""+'g'+'o'+'o'+'d';

declare two constant strings, and here is the bytecode:

   0: ldc           #2                  // String good
   2: astore_1
   3: ldc           #2                  // String good
   5: astore_2

The compiler loads both from the same constant-pool entry (#2) straight away.
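Written out by hand, the StringBuilder sequence from the first bytecode listing looks roughly like the sketch below. The class and method names are invented for this example, and the code mirrors the listing above rather than what every compiler version emits.

```java
class StringBuilderEquivalentDemo {
    // Hand-written equivalent of the append/toString sequence in the listing.
    static String concat(char[] c) {
        StringBuilder sb = new StringBuilder();
        sb.append("");   // the leading "" from the source expression
        sb.append(c[0]);
        sb.append(c[1]);
        sb.append(c[2]);
        sb.append(c[3]);
        return sb.toString();  // creates a new String for these contents
    }

    static boolean sameObjectAsLiteral() {
        return concat(new char[] {'g', 'o', 'o', 'd'}) == "good";
    }

    static boolean sameContentsAsLiteral() {
        return concat(new char[] {'g', 'o', 'o', 'd'}).equals("good");
    }

    public static void main(String[] args) {
        System.out.println(sameObjectAsLiteral());    // false
        System.out.println(sameContentsAsLiteral());  // true
    }
}
```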

5 Comments

Cool answer. Can you show the bytecode for the other case where they come out equal?
There's bound to be something in the Java Language Specification about this. Without a clear answer the outcomes would be compiler implementation dependent which is terrifying.
@AlainO'Dea I've just edited my answer, I am looking for the explanation in the specs, but no luck so far :)
take a look at my answer. Feel free to incorporate it if you like it.
Thank you for the other bytecode. That's very informative!
