6

I am trying to make a function graphing program in Java, and it involves taking a user's input for the function which will be graphed, parsing it, and graphing it. For example, the user might enter in x^2 - y^2, cos(x + y), log(x) - sqrt(y), etc. The program makes use of both infix binary operations (+, -, etc.) and unary operations (cos, sqrt, etc.).

In short, in order to evaluate the unary operations, I must make sure that a given expression follows the format of a single unary operation. For example, cos(x), sqrt(x + y), and log(exp(y) - x) all fit this format, since they are unary operations with some expression as their operand; however, strings such as sin(x)*cos(y) and 1 + log(x) do not follow this format. In order to check, I made a regular expression for this format:

String unaryName = "((productlog)|(zeta)|(log)|(sqrt)|(cos)|(sin)|(tan)|(sec)|(csc)|(csc)|(abs)|(arccos)|(arcsin)|(arctan)|(arcsec)|(arccsc)|(arccot)|(gamma)|(exp))";

(this is just a regex to check if a given string is a name for a predefined unary operation)

String unaryOperation = unaryName + "\\(([^\\(\\)]*(\\(.*\\))*[^\\(\\)]*)+\\)"

I'll give an explanation. This regex is looking for the name of one of the unary operations. After that, it looks for a left-parenthesis. After that, it looks for some sequence of characters which are not parentheses, and then some sequence which begins with a left parenthesis and ends with a right parenthesis. The latter prevents a String such as "sin(x) + cos(y)" from matching.

This regular expression always gives desired results, as far as I can tell. However, in its use, one problem arises. Consider this situation:

String s = "cos(3) + sin(4)";
System.out.println(s.matches(unaryOperation));

Obviously, if the regex works, this should return false, which it does. The same would be true with this example:

String s = "cos(3.000) + sin(4)";
System.out.println(s.matches(unaryOperation));

Nothing really changed, pattern-wise. However, successively adding zeroes to the 3, the match seems to take exponentially longer to evaluate. For me, 12 zeroes takes about 13 seconds. Since my program will be plotting many points on a graph, it will have to calculate thousands of expressions every time it graphs something, so this is a fatal flaw.

I've already found a way around having to use this regular expression and my program works quite nicely, but I would still like to know: why does this regular expression take so long to work for large inputs, and is there any way to change the regex to fix this issue?

1
  • 1
    Why are you parsing expressions with regular expressions? Commented Jan 3, 2013 at 3:54

2 Answers 2

1

You can use this regex

unaryName+"\\([^)]*(\\([^()]*\\))?[^(]*\\)"
                    ------------
                         |->starting from center.

Here I am checking whether the round brackets are properly balanced ..That should solve your problem!

Sign up to request clarification or add additional context in comments.

2 Comments

You don't need the anchors when using String.matches.
Thanks! My only qualm was that your regex didn't match cos(x), or any unary operation without nested parentheses, but it was easy to fix: unaryName+"\([^)]*(\([^()]*\))*[^(]*\)$"
0

I suspect the problem is that your expression is doing a lot of backtracking because of the .* in the middle of the pattern. Try replacing it with a reluctant quantifier: .*? or, better yet (if I understand the logic), with [^\\)]*.

Actually, wouldn't this do the trick:

String unaryOperation = unaryName + "\\([^\\)]*\\)";

This looks for a name, a left parenthesis, any number of non-right-parenthesis characters, and then a right parenthesis. This assumes that you don't want to match things like

"cos(3 * (4 + x))"

(which your pattern also won't match).

4 Comments

I did that, and it still takes 10 seconds. A minor improvement, but still not sufficient. EDIT- Also tried the second suggestion, didn't work either.
I do want to match things like cos(3 + (4 + x)), and I do believe my original regex does match that.
@MikeB - I stand corrected; your original does match nested parens. However, it also matches unbalanced parens (such as "cos(3 * (4 + 5)))" or "cos(3 * (4 - sin(6 - 2))"). That's probably not what you want. (Regex can't be used to match parentheses to arbitrary depth, and the complexity of matching to bounded depth grows exponentially with the depth.)
That's not an issue, if there are any mismatched parentheses then the program would have caught them beforehand. When the program is checking if the expression is a unary operation, you can assume the expression is syntactically valid.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.