0

I want to match alphanumeric words separated by the operators, +, -, *, /, <, > and ending with a semicolon. There can be whitespace characters in between e.g. the following strings should return true:

first + second;
first - second;
first * second;
first / second;
first;
first + second < third;
third < second * first;

This is what I have tried:

public boolean isExpr(String line) {
    // factor = ([A-Za-Z]+|[0-9]+)   for example: aasdaa or 23131 or xyz or 1 or a
    // simple-expr = (factor {mulop factor} {addop factor {mulop factor}})
    // expr = simple-expr compop simple-expr | simple-expr

    String factor = new String("([A-Za-Z]+|[0-9]+)");
    String mulOp = new String("(\\*|\\/)");  // '*'' or '/'
    String addOp = new String("(\\+|\\-)");  // '+' or '-'
    String compOp = new String("(\\<|\\=");  // '<' or '='
    String simpleExpr = new String("(" + factor + " (" + mulOp + " " + factor + ")? (" + addOp + " " + factor + " (" + mulOp + " " + factor + ")?)?");
    String expr = new String("(" + simpleExpr + " " + compOp + " " + simpleExpr + ")|" + simpleExpr);

    System.out.println(line.matches(expr));

    return line.matches(expr);
}

What is wrong with that code and how can I solve it?

I got the below error on executing my code:

Exception in thread "main" java.util.regex.PatternSyntaxException: Illegal character range near index 9
((([A-Za-Z]+|[0-9]+) ((\*|\/) ([A-Za-Z]+|[0-9]+))? ((\+|\-) ([A-Za-Z]+|[0-9]+) ((\*|\/) ([A-Za-Z]+|[0-9]+))?)? (\<|\= (([A-Za-Z]+|[0-9]+) ((\*|\/) ([A-Za-eZ]+|[0-9]+))? ((\+|\-) ([A-Za-Z]+|[0-9]+) ((\*|\/) ([A-Za-Z]+|[0-9]+))?)?)|(([A-Za-Z]+|[0-9]+) ((\*|\/) ([A-Za-Z]+|[0
-9]+))? ((\+|\-) ([A-Za-Z]+|[0-9]+) ((\*|\/) ([A-Za-Z]+|[0-9]+))?)?
9
  • 2
    A-Za-Z should be A-Za-z Commented Jun 19, 2021 at 15:11
  • I updated, still gives error. I'm editing the post. Can you check it again? Commented Jun 19, 2021 at 15:12
  • 3
    Doing things like new String"([A-Za-z]+|[0-9]+)") makes no sense at all, "([A-Za-z]+|[0-9]+)" already is a string, you don't need create a new one, and doing so is a waste of time and memory. The same goes for all you other uses of new String(...). The only constructor of String you should ever need to call is one that takes a byte array and a character set. Commented Jun 19, 2021 at 15:17
  • 1
    Count parenthesis and you should easily find place where it doesn't match. Also please don't change question about one problem into question about other problem. Harsh reality of this site is that it isn't to help only question asker but mostly to help people facing same problem in the future. So we try to gather questions about single problem per "topic" (even if your current problem will be solved and new one will appear you should ask about new problem in separate question - and yes you may need to wait to ask new question). Commented Jun 19, 2021 at 15:19
  • 1
    Since only posted answer is about second problem probably no one will get angry if you change question into problem which was being pointed out by in that answer (and don't forget to also update question title since currently it is about first problem). Commented Jun 19, 2021 at 15:43

1 Answer 1

6

Your logic is unnecessarily complex and error-prone.

I suggest you, instead of using unnecessarily complex and error-prone logic, simply use the regex, [A-Za-z0-9]+(?:\s*[\/*+\-<>]\s*[A-Za-z0-9]+\s*)*; which covers all the example strings you have posted in the question.

Explanation of the regex:

  • [A-Za-z0-9]+: 1+ alphabets or digits
  • (?:: Open non-capturing group
    • \s*: 0+ whitespace characters
    • [\/*+\-<>]: One of /, *, +, -, <, >
    • \s*: 0+ whitespace characters
    • [A-Za-z0-9]+: 1+ alphabets or digits
    • \s*: 0+ whitespace characters
  • ): Close non-capturing group
  • *: Quantifier to make the non-capturing group match 0+ times
  • ;: The charcter literal, ;

Demo:

import java.util.stream.Stream;

public class Main {
    public static void main(String[] args) {
        // Test
        Stream.of(
                    "first + second;",
                    "first * second;",
                    "first - second;",
                    "first / second;",
                    "first;",
                    "first + second < third;",
                    "third < second * first;"
        ).forEach(s -> System.out.println(isExpr(s)));      
    }

    public static boolean isExpr(String line) {
        return line.matches("[A-Za-z0-9]+(?:\\s*[\\/*+\\-<>]\\s*[A-Za-z0-9]+\\s*)*;");
    }
}

Output:

true
true
true
true
true
true
true

What went wrong with your code?

Because of the unnecessarily complex logic that you have implemented, one or more of the parentheses in the final regex have not been closed. In addition to that, I can see at least one part where the parenthesis has not been closed e.g.

String compOp = new String("(\\<|\\=");  // '<' or '='

It should be

String compOp = new String("(\\<|\\=)");  // '<' or '='
//----------------------------------^

Apart from this, given below are a couple of more things that you should learn/address:

  1. You can simplify the initialization like:
String factor = "[A-Za-z]+|[0-9]+";
String mulOp = "\\*|\\/";  // '*'' or '/'
String addOp = "\\+|\\-";  // '+' or '-'
String compOp = "\\<|\\=";  // '<' or '='
  1. Change a-Z to a-z.
Sign up to request clarification or add additional context in comments.

5 Comments

There are actually four missing closing parenthesis.
I updated, still gives error. I'm editing the post. Can you check it again?
@lapestand - Your code looks unnecessarily complex. Let me know what value you are passing to String line and I will suggest you a simpler code.
@lapestand - And, what is your expected output (true/false) for this string?
Sorry, It should be true. Because it corresponds to factor + factor which is simple-expr

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.