4

For a school project I was asked to write a simple math parser in Java. The program works fine. So fine that I used NetBeans profiler tool to check the performance of the program. For that I made a loop of 1000 calls to the math parser of the following expression: "1-((x+1)+1)*2", where x was replaced by the current loop count. It took 262ms. The thing is, it took 50% of the time in the method splitFormula, which I shall present below:

private static void splitFormula(String formula){
    partialFormula=new ArrayList<>();

    for(String temp: formula.split("\\+|\\-|\\*|\\/"))
        partialFormula.add(temp);
}

, where partialFormula is an ArrayList of Strings. To numerically evaluate an expression I need to call the splitFormula method various times so I really need to clear the contents of the partialFormula ArrayList - first line.

My question is: is there a faster way to split a string then add the partial strings to the an arraylist? Or is there some other method that can be used to split a string then use the substrings?

12
  • ArrayLists begin with certain size and keeps growing. For example if the size of array list is 10 and you add 11th element, then a new array list of say size 20 is allocated, and all teh elements of the old array lists are copied into new one and so on. Try linked list instead. Commented Apr 14, 2014 at 23:18
  • @MohanKumar I didn't know that. Let me check your suggestion. Thank you. Commented Apr 14, 2014 at 23:20
  • I was wondering. Isnt better idea to use partialFormula.clear() instead of "clearing" it the way mentioned above ? Commented Apr 14, 2014 at 23:22
  • I think you need to parse the formula manually, by using Pattern and Matcher. Commented Apr 14, 2014 at 23:23
  • 4
    @MohanKumar A linked list adds indirection, arrays do not. The growing operation is almost always going to be performed on cache, which is WAY faster than the overhead introduced by linked list indirections Commented Apr 15, 2014 at 0:35

4 Answers 4

8

Regular expressions can slow things down (String#split uses regex). In general, if you want to write easy code, regex is good, but if you want fast code, see if there is another way. Try doing this without regex:

Edit: This should be a better method (keep track of the indices instead of append to a StringBuilder):

private static void splitFormula(String formula){
    partialFormula.clear(); // since there is a method for this, why not use it?

    int lastIndex = 0;
    for (int index = 0; index < formula.length(); index++) {
        char c = formula.charAt(index);
        if (c == '-' || c == '+' || c == '*' || c == '/') {
            partialFormula.add(formula.substring(lastIndex, index));
            lastIndex = index + 1; //because if it were index, it would include the operator
        }
    }
    partialFormula.add(formula.substring(lastIndex));
}

StringBuilder approach:

private static void splitFormula(String formula){
    partialFormula.clear();

    StringBuilder newStr = new StringBuilder();

    for (int index = 0; index < formula.length(); index++) {
        char c = formula.charAt(index);
        if (c == '-' || c == '+' || c == '*' || c == '/') {
            partialFormula.add(newStr.toString());
            newStr.setLength(0);
        } else {
            newStr.append(c);
        }
    }
    partialFormula.add(newStr.toString());
}

If we look at the source code for String#split, it becomes apparent why that is slower (from GrepCode):

public String[] split(String regex, int limit) {
    return Pattern.compile(regex).split(this, limit);
}

It compiles a regex every time! Thus, we can see that another way of speeding up the code is to compile our regex first, then use the Pattern#split to split:

//In constructor, or as a static variable.
//This regex is a better form of yours.
Pattern operatorPattern = Pattern.compile("[-*+/]");
...
private static void splitFormula(String formula){
    partialFormula.clear();

    for(String temp: operatorPattern.split(formula)) {
        partialFormula.add(temp);
    }
}
Sign up to request clarification or add additional context in comments.

6 Comments

There's a flaw in your code. I tried it with the expression 1+2 and your code as it stands only accounts for the first element: "1". But I honestly don't see the problem of your code =/
@PML Oh oops. Forgot to add for the end of the string :-)
It works. It's just missing a ')' in the last statement. Well I can't thank you enough for your time and the explanation. The whole program now only takes, 134ms. The method itself takes now 13.1ms. Thank you so so much!
In "partialFormula.add(formula.substring(lastIndex);" There's a ')' missing to close the .add method =)
@PML The perils of not using an IDE to answer StackOverflow questions.
|
0

You don't need a for loop. split returns an array, and you can create an ArrayList out of the array:

partialFormula = new ArrayList<>(Arrays.asList(formula.split("\\+|\\-|\\*|\\/")));

Whether this is significantly faster or not, I don't know.

4 Comments

Let me quickly check your suggestion =)
Sadly it isn't. It takes much longer. The method took almost 100ms more than the loop =/
Simpler regex: [-+*/].
@PML Well, darn. Sorry.
0

Try pre-allocating the ArrayList beforehand so we do not have to pay for reallocation when the list grows. The number 20 below is just a placeholder. Pick a number that is a little bigger than the largest expression you expect.

partialFormula=new ArrayList<String>(20);

See this question for a discussion of what this might gain you.

Comments

0

This will create an arrayList of strings.

String a= "1234+af/d53";
    char [] blah=a.toCharArray(); 
    ArrayList<String> list=new ArrayList<String>();
    for (int i = 0; i < blah.length; i++) {
        list.add(Character.toString(blah[i]));  
    }

3 Comments

Yes, it will create an array list of strings. But not how the OP wants it. Your example string would be the list 1,2,3,4,+,a,f,?,d,5,3 when OP would want 1234,af,d53
I misunderstood. It should be String [] blah=a.split("\\+|\\-|\*|\\/"); Then just use list.add(blah[i]);
Which is exactly what the OP had but with more lines of code.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.