1

I have a concern regarding, how java compare the String and how optimized it is.

Let's consider I require to compare user provided String, with 10 Strings in my code. Out of those 10 Strings 5 are starting with 'A' and other 5 are starting with 'B'.

Now if I write if..elseif.. or switch, then each string will be compared until it matches with any String, and under worst case scenario, where either String is getting matched in 9th or 10th condition or not at all. So average 8 to 10 conditions are executing for each input.

Now my question is can we optimize this situation by putting one more condition (kind of filter) before checking the input string with actual value. Like below,

   if(inputString.charAt(0) == 'A'){
            if(){
                ....
            }else if(){
                ....
            }
        }else if(inputString.charAt(0) == 'B'){
            if(){
                ...
            }else if(){
                ...
            }
        }

Can it improve the performance of the system or Java is already internally optimized for this kind of situation.

7
  • 1
    Put the ten strings in an array, compare with that in loop, break if true... Commented May 11, 2015 at 10:19
  • @TonyHopkinson, but that will only reduce the line of code, but conditions being matched will be same in both cases. Commented May 11, 2015 at 10:21
  • If you look at implementation of String#equals method, you'll notice that str1.equals(str2) is the most efficient way to do what you need without writing compilcated/hard to understand code like you posted. Commented May 11, 2015 at 10:27
  • 2
    @Vicky - If you are looking for a better algorithm, then there isn't a simple one that does well than n X m where n is 10 and m is length of string that is being searched, That's the worst case. BTW equals() will return on the first unmatched character. So that's not an issue. If you ask me, then ~10 strings is too less a number , even to do micro-benchmarking. You shouldn't worry about performance in such trivial cases. Commented May 11, 2015 at 10:27
  • Reducing the LOC is desirable in and of itself. I daresay the parsing cost is a significant proportion of the search cost for a mere ten strings. On top of that if you use Array.Find you'll give the JVM a chance at optimisation. this it's unlikely to touch. Commented May 11, 2015 at 10:27

3 Answers 3

1

The if statement

Discussions usually begin surrounding complex if statements such as this:

if (value == 0){
    return result0;
} else if (value == 1){
    return result1;
} else if (value == 2){
    return result2;
} else if (value == 3){
    return result3;
} else if (value == 4){
    return result4;
} else if (value == 5){
    return result5;
} else if (value == 6){
    return result6;
} else if (value == 7){
    return result7;
} else if (value == 8){
    return result8;
} else if (value == 9){
    return result9;
} else {
    return result10;
}

Typically, this type of construct is frowned upon. The major problem is that the deeper into the statement the execution flows, the more conditions have to be evaluated. It will take longer to complete the execution when value is 9 than if value is 0 because every other condition must be evaluated beforehand. As the overall number of conditions increases, so does the performance hit for going deep into the conditions. While having a large number of if conditions isn’t advisable, there are steps you can take to increase the overall performance.

The first step is to arrange the conditions in decreasing order of frequency. Since exiting after the first condition is the fastest operation, you want to make sure that happens as often as possible. Suppose the most common case in the previous example is that value will equal 5 and the second most common is that value will equal 9. In that case, you know five conditions will be evaluated before getting to the most common case and nine before getting to the second most common case; this is incredibly inefficient. Even though the increasing numeric order of the conditions makes it easier to read, it should actually be rewritten as follows:

if (value == 5){

    return result5;
} else if (value == 9){
    return result9;
} else if (value == 0){
    return result0;
} else if (value == 1){
    return result1;
} else if (value == 2){
    return result2;
} else if (value == 3){
    return result3;
} else if (value == 4){
    return result4;
} else if (value == 6){
    return result6;
} else if (value == 7){
    return result7;
} else if (value == 8){
    return result8;
} else {
    return result10;
}

Now the two most common conditions appear at the top of the if statement, ensuring optimal performance for these cases.

Another way to optimize if statements is to organize the conditions into a series of branches, following a binary search algorithm to find the valid condition. This is advisable in the case where a large number of conditions are possible and no one or two will occur with a high enough rate to simply order according to frequency. The goal is to minimize the number of conditions to be evaluated for as many of the conditions as possible. If all of the conditions for value in the example will occur with the same relative frequency, the if statements can be rewritten as follows:

if (value < 6){

    if (value < 3){
        if (value == 0){
            return result0;
        } else if (value == 1){
            return result1;
        } else {
            return result2;
        }
    } else {
        if (value == 3){
            return result3;
        } else if (value == 4){
            return result4;
        } else {
            return result5;
        }
    }

} else {

    if (value < 8){
        if (value == 6){
            return result6;
        } else {
            return result7;
        }
    } else {
        if (value == 8){
            return result8;
        } else if (value == 9){
            return result9;
        } else {
            return result10;
        }

    }
}

This code ensures that there will never be any more than four conditions evaluated. Instead of evaluating each condition to find the right value, the conditions are separated first into a series of ranges before identifying the actual value. The overall performance of this example is improved because the cases where eight and nine conditions need to be evaluated have been removed. The maximum number of condition evaluations is now four, creating an average savings of about 30% off the execution time of the previous version. Keep in mind, also, that an else statement has no condition to evaluate. However, the problem remains that each additional condition ends up taking more time to execute, affecting not only the performance but also the maintainability of this code. This is where the switch statement comes in.

The switch statement

The switch statement simplifies both the appearance and the performance of multiple conditions. You can rewrite the previous example using a switch statement as follows:

switch(value){
    case 0:
        return result0;
    case 1:
        return result1;
    case 2:
        return result2;
    case 3:
        return result3;
    case 4:
        return result4;
    case 5:
        return result5;
    case 6:
        return result6;
    case 7:
        return result7;
    case 8:
        return result8;
    case 9:
        return result9;
    default:
        return result10;
}

This code clearly indicates the conditions as well as the return values in an arguably more readable form. The switch statement has the added benefit of allowing fall-through conditions, which allow you to specify the same result for a number of different values without creating complex nested conditions. The switch statement is often cited in other programming languages as the hands-down better option for evaluating multiple conditions. This isn’t because of the nature of the switch statement, but rather because of how compilers are able to optimize switch statements for faster evaluation.

Another option: Array lookup

There are more than two solutions for dealing with conditionals in JavaScript. Alongside the if statement and the switch statement is a third approach: looking up values in arrays

for(String s: arr){
    if(s.equals(targetValue))
        return true;
}
return false;

Although array lookup times also increase the deeper into the array you go, the incremental increase is so small that it is irrelevant relative to the increases in each condition evaluation for if and switch statements. This makes array lookup ideal whenever there are a large number of conditions to be met, and the conditions can be represented by discrete values such as numbers or strings

The fastest conditionals

The three techniques presented here—the if statement, the switch statement, and array lookup—each have their uses in optimizing code execution:

•Use the if statement when:

  • There are no more than two discrete values for which to test.
  • There are a large number of values that can be easily separated into ranges.

•Use the switch statement when:

  • There are more than two but fewer than 10 discrete values for which to test.
  • There are no ranges for conditions because the values are nonlinear.

•Use array lookup when:

  • There are more than 10 values for which to test.
  • The results of the conditions are single values rather than a number of actions to be taken.
Sign up to request clarification or add additional context in comments.

Comments

0

You could use a lookup from a Map or Set instead of a loop through each value.

3 Comments

What about time taken for creation of Map / Set?. That's not a good solution?
That just depends on the nature of the "10 Strings in my code". Are they static?
Frankly speaking - nothing can beat array access in time-complexity. Frankly speaking using a Set or a Map aren't efficient at all. Imagine the time taken for all the stuff that has to be supported which is not necessary. And worst case access time of HashMap isn't O(1) either.
0

Java compare Strings character by character. With your "optimization" u are comparing one more char than needed each time. You start comparing the first character (A or B), and then the whole word including again the first character.

Your solutions its nice, because reduces the number of whole word comparations. If you cut the word in 2 parts (the first letter and the rest of the word) u can compare the first char at the start, just like u are doing... and then the other part. The problem is probably you will spend more time slicing the word than comparing.... so check it ;)

And one more thing... if you use a switch, the time spent on the search does not depends on the number of elements... so the first optimization is not needed if u use a switch.

Also u can make a global HashMap to find your string inside it... so the "map creation" of the switch is avoided...

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.