I am considering submitting an RFE (request for enhancement) to the Oracle Bug Database that should significantly improve string concatenation performance. Before I do, I'd like to hear experts' comments on whether it makes sense.
The idea is based on the fact that the existing String.concat(String) is about twice as fast as StringBuilder when joining 2 strings. The problem is that there is no method to concatenate 3 or more strings. External methods cannot do this efficiently because String.concat uses the package-private constructor String(int offset, int count, char[] value), which does not copy the char array but uses it directly. This is what makes String.concat fast. Although StringBuilder is in the same package, it still cannot use this constructor, because the String's char array would then be exposed to modification.
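To illustrate why a non-copying constructor cannot safely be handed to a mutable builder, here is a minimal sketch. SharedText is a hypothetical class, not part of the JDK; it is assumed to keep the caller's char[] without copying, the way the package-private constructor described above does.

```java
final class SharedArrayDemo {

    // Hypothetical stand-in for a string type whose constructor keeps the
    // caller's array instead of copying it.
    static final class SharedText {
        private final char[] value;

        SharedText(char[] value) {
            this.value = value; // no defensive copy: the array is shared
        }

        @Override
        public String toString() {
            return new String(value); // the public String constructor copies
        }
    }

    // Wraps an array, then modifies the array through the original reference.
    static String mutateAfterWrap() {
        char[] buf = {'a', 'b', 'c'};
        SharedText text = new SharedText(buf);
        buf[0] = 'X'; // the "immutable" text changes along with the array
        return text.toString();
    }

    public static void main(String[] args) {
        System.out.println(mutateAfterWrap()); // prints "Xbc"
    }
}
```

A builder that keeps appending into its buffer after handing it out would be in the position of `buf` here, which is why the sharing trick only works when the array is never touched again.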
I suggest adding the following methods to String:
    public static String concat(String s1, String s2)
    public static String concat(String s1, String s2, String s3)
    public static String concat(String s1, String s2, String s3, String s4)
    public static String concat(String s1, String s2, String s3, String s4, String s5)
    public static String concat(String s1, String... array)
Note: this kind of overloading is used in EnumSet.of, for efficiency.
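For reference, this is roughly how the EnumSet.of overloads look from the caller's side. The Color enum and the method names are made up for the example; the point is only that calls with up to five elements resolve to fixed-arity overloads, while longer calls go through of(E first, E... rest) and allocate a varargs array.

```java
import java.util.EnumSet;

final class EnumSetOverloads {

    // Hypothetical enum, used only for this illustration.
    enum Color { RED, GREEN, BLUE, CYAN, MAGENTA, YELLOW }

    // Resolves to the fixed-arity overload of(E, E): no array is created.
    static EnumSet<Color> two() {
        return EnumSet.of(Color.RED, Color.GREEN);
    }

    // Six arguments: resolves to of(E first, E... rest), which packs the
    // last five arguments into an array.
    static EnumSet<Color> six() {
        return EnumSet.of(Color.RED, Color.GREEN, Color.BLUE,
                          Color.CYAN, Color.MAGENTA, Color.YELLOW);
    }

    public static void main(String[] args) {
        System.out.println(two().size() + " " + six().size());
    }
}
```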
This is the implementation of one of the methods; the others work the same way:
    public final class String {
        private final char[] value;
        private final int count;
        private final int offset;

        // Package-private: uses the given array directly, without copying it
        String(int offset, int count, char[] value) {
            this.value = value;
            this.offset = offset;
            this.count = count;
        }

        public static String concat(String s1, String s2, String s3) {
            // Allocate the result array once, at its final size
            char[] buf = new char[s1.count + s2.count + s3.count];
            System.arraycopy(s1.value, s1.offset, buf, 0, s1.count);
            System.arraycopy(s2.value, s2.offset, buf, s1.count, s2.count);
            System.arraycopy(s3.value, s3.offset, buf, s1.count + s2.count, s3.count);
            return new String(0, buf.length, buf);
        }
    }
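The varargs variant could follow the same pattern. The sketch below is an approximation written against the public API only, so it lives in a hypothetical helper class (ConcatSketch) and has to finish with new String(char[]), which copies the buffer once more; a version inside java.lang.String could presumably skip that copy by using the package-private constructor.

```java
final class ConcatSketch {

    // Sketch of concat(String s1, String... array) using only public API.
    static String concat(String first, String... rest) {
        // First pass: compute the total length so the buffer is sized once.
        int total = first.length();
        for (String s : rest) {
            total += s.length();
        }

        // Second pass: copy each string's characters into place.
        char[] buf = new char[total];
        first.getChars(0, first.length(), buf, 0);
        int pos = first.length();
        for (String s : rest) {
            s.getChars(0, s.length(), buf, pos);
            pos += s.length();
        }

        // Outside java.lang this copies buf; the proposal would avoid that.
        return new String(buf);
    }

    public static void main(String[] args) {
        System.out.println(concat("a", "bc", "def")); // prints "abcdef"
    }
}
```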
Also, once these methods are added to String, the Java compiler will be able to translate
    String s = s1 + s2 + s3;
into the efficient
    String s = String.concat(s1, s2, s3);
instead of the current, inefficient
    String s = (new StringBuilder(String.valueOf(s1))).append(s2).append(s3).toString();
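One detail such a compiler rewrite would have to preserve: the + operator renders a null operand as the text "null", and so does the StringBuilder chain, whereas a method that reads its arguments' fields directly would throw NullPointerException. The small sketch below (class and method names are made up for illustration) compares the existing forms.

```java
final class ConcatForms {

    // What the programmer writes.
    static String withPlus(String a, String b, String c) {
        return a + b + c;
    }

    // The StringBuilder chain quoted above.
    static String withBuilder(String a, String b, String c) {
        return new StringBuilder(String.valueOf(a)).append(b).append(c).toString();
    }

    // Chaining the existing instance method: creates an intermediate String
    // and throws NullPointerException if any operand is null.
    static String withChainedConcat(String a, String b, String c) {
        return a.concat(b).concat(c);
    }

    public static void main(String[] args) {
        System.out.println(withPlus("a", null, "c"));    // prints "anullc"
        System.out.println(withBuilder("a", null, "c")); // prints "anullc"
        System.out.println(withChainedConcat("a", "b", "c")); // prints "abc"
    }
}
```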
UPDATE: Performance test. I ran it on my notebook (Intel Celeron 925), concatenating 3 strings. My String2 class emulates exactly how it would work in the real java.lang.String. The string lengths are chosen to put StringBuilder in the most unfavourable conditions, that is, it needs to expand its internal buffer on each append, while concat always creates the char[] only once.
    public class String2 {
        private final char[] value;
        private final int count;
        private final int offset;

        String2(String s) {
            value = s.toCharArray();
            offset = 0;
            count = value.length;
        }

        // Mirrors the package-private String constructor: no array copy
        String2(int offset, int count, char[] value) {
            this.value = value;
            this.offset = offset;
            this.count = count;
        }

        public static String2 concat(String2 s1, String2 s2, String2 s3) {
            char[] buf = new char[s1.count + s2.count + s3.count];
            System.arraycopy(s1.value, s1.offset, buf, 0, s1.count);
            System.arraycopy(s2.value, s2.offset, buf, s1.count, s2.count);
            System.arraycopy(s3.value, s3.offset, buf, s1.count + s2.count, s3.count);
            return new String2(0, buf.length, buf);
        }

        public static void main(String[] args) {
            String s1 = "1";
            String s2 = "11111111111111111";
            String s3 = "11111111111111111111111111111111111111111";
            String2 s21 = new String2(s1);
            String2 s22 = new String2(s2);
            String2 s23 = new String2(s3);
            long t0 = System.currentTimeMillis();
            for (int i = 0; i < 1000000; i++) {
                // version 1: proposed concat
                String2 s = String2.concat(s21, s22, s23);
                // version 2: StringBuilder (swap the comments to measure it)
                // String s = new StringBuilder(s1).append(s2).append(s3).toString();
            }
            System.out.println(System.currentTimeMillis() - t0);
        }
    }
On 1,000,000 iterations the results are:
version 1 (String2.concat) = ~200 ms
version 2 (StringBuilder) = ~400 ms
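The buffer expansion that the test is designed to provoke can be observed through StringBuilder.capacity(). The sketch below assumes string lengths of 1, 17 and 41, which is what the literals in the test appear to be; the helper names are made up. The initial capacity (16 plus the length of the constructor argument) is documented, but the exact growth policy after that is an implementation detail, so the code only reports the values rather than relying on them.

```java
import java.util.Arrays;

final class BuilderGrowth {

    // Builds a string of n '1' characters (helper for this sketch only).
    static String ones(int n) {
        char[] c = new char[n];
        Arrays.fill(c, '1');
        return new String(c);
    }

    // Returns the builder's capacity after construction and after each append.
    static int[] capacities(String s1, String s2, String s3) {
        StringBuilder sb = new StringBuilder(s1);
        int afterInit = sb.capacity();   // documented: 16 + s1.length()
        sb.append(s2);
        int afterSecond = sb.capacity(); // grows if s2 did not fit
        sb.append(s3);
        int afterThird = sb.capacity();  // grows again if s3 did not fit
        return new int[] {afterInit, afterSecond, afterThird};
    }

    public static void main(String[] args) {
        int[] caps = capacities(ones(1), ones(17), ones(41));
        System.out.println(Arrays.toString(caps));
    }
}
```

Each increase in the reported capacity corresponds to a new, larger internal array and a copy of the existing contents, which is the extra work the three-argument concat avoids by sizing its buffer up front.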