0

When I try to execute the code below:

text.matches("[a-zA-Z0-9 !\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~]");

I get this exception:

Exception in thread "main" java.util.regex.PatternSyntaxException: Unclosed character class near index 43
[a-zA-Z0-9 !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~]
                                           ^
    at java.util.regex.Pattern.error(Unknown Source)
    at java.util.regex.Pattern.clazz(Unknown Source)
    at java.util.regex.Pattern.sequence(Unknown Source)
    at java.util.regex.Pattern.expr(Unknown Source)
    at java.util.regex.Pattern.compile(Unknown Source)
    at java.util.regex.Pattern.<init>(Unknown Source)
    at java.util.regex.Pattern.compile(Unknown Source)
    at java.util.regex.Pattern.matches(Unknown Source)
    at java.lang.String.matches(Unknown Source)
    at test.G3Utils.checkIsAttribANS(G3Utils.java:47)
    at test.G3Utils.main(G3Utils.java:6)

Please help me solve this.

3
  • @Matthew: I reverted your edit because he was using two backslashes all along; they just weren't displaying properly because he didn't use code formatting. If he had been using just one, the code (the Java code, that is) wouldn't have compiled. Commented Feb 12, 2015 at 11:31
  • ah! That makes a lot of sense. Commented Feb 12, 2015 at 11:32
  • Note that you are lucky that ,-. are next to each other in the ASCII table, so the range from comma to dot is the same as specifying those 3 characters. Usually, it's less confusing to write - at the end of the character class. Commented Feb 12, 2015 at 11:47

3 Answers

2

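You need to escape the [ character, just as you escaped the ] character. Inside a Java character class, an unescaped [ opens a nested character class, which is why the parser reports an unclosed character class.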

So the fixed version of your code, with the regex written as a Java string literal, is:

text.matches("[a-zA-Z0-9 !\"#$%&'()*+,-./:;<=>?@\\[\\]^_`{|}~]");
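To check the difference yourself, here is a minimal sketch. The class name EscapedBracketDemo and the helpers compiles and isSingleAllowedChar are made up for illustration; only String.matches, Pattern.compile and PatternSyntaxException come from the JDK. Note that the pattern, as written in the question, matches exactly one character.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class EscapedBracketDemo {
    // Pattern from the question: the '[' inside the class is NOT escaped.
    static final String BROKEN = "[a-zA-Z0-9 !\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~]";
    // Same pattern with both '[' and ']' escaped.
    static final String FIXED = "[a-zA-Z0-9 !\"#$%&'()*+,-./:;<=>?@\\[\\]^_`{|}~]";

    // Hypothetical helper: true if the regex compiles,
    // false if Pattern.compile throws PatternSyntaxException.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // Hypothetical helper: true if text is exactly one allowed character.
    static boolean isSingleAllowedChar(String text) {
        return text.matches(FIXED);
    }

    public static void main(String[] args) {
        System.out.println("broken compiles: " + compiles(BROKEN));
        System.out.println("fixed compiles:  " + compiles(FIXED));
        System.out.println("'[' allowed:     " + isSingleAllowedChar("["));
        System.out.println("']' allowed:     " + isSingleAllowedChar("]"));
        System.out.println("\"ab\" allowed:    " + isSingleAllowedChar("ab"));
    }
}
```

If the intent is to validate a whole string rather than a single character, a quantifier such as + after the closing ] would be needed; that is an assumption about the goal, not something stated in the question.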

1

When using a literal ] within a character class, many regex flavors expect it to be the first character; otherwise the parser treats it as the end of the class. Java also accepts an escaped ] (see the next paragraph).

In Java you also need to escape [ with a backslash, and because the regex is written as a Java string literal, the backslash itself has to be escaped. So replace [ with \\[

This will make your Regex work:

text.matches("[]a-zA-Z0-9 !\"#$%&'()*+,-./:;<=>?@\\[^_`{|}~]");

Also note that ,-. matches the range from comma to dot. If that is not the desired behavior, move the - to the last position. (It happens to work here because comma, hyphen and dot are consecutive in the ASCII table.)
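The point about the hyphen can be seen in a small sketch. The class name HyphenPlacementDemo and the constant names are invented for this example; the patterns are deliberately cut down to the few characters that matter.

```java
class HyphenPlacementDemo {
    // Hyphen between ',' and '.': parsed as the range U+002C..U+002E,
    // which happens to contain exactly ',', '-' and '.'.
    static final String AS_RANGE = "[,-.]";
    // Hyphen in last position: parsed as a literal '-'.
    static final String HYPHEN_LAST = "[,.-]";
    // A range that is wider than it looks: '+' (U+002B) to '/' (U+002F)
    // also covers ',', '-' and '.'.
    static final String WIDER_RANGE = "[+-/]";
    // The same three characters with the hyphen last: only '+', '/' and '-'.
    static final String WIDER_HYPHEN_LAST = "[+/-]";

    public static void main(String[] args) {
        for (String s : new String[] {",", "-", "."}) {
            System.out.println("'" + s + "' range=" + s.matches(AS_RANGE)
                    + " hyphenLast=" + s.matches(HYPHEN_LAST));
        }
        // The comma is matched by the range but not by the literal list.
        System.out.println(",".matches(WIDER_RANGE));
        System.out.println(",".matches(WIDER_HYPHEN_LAST));
    }
}
```

Putting the hyphen last keeps the class reading as a plain list of characters, so nothing is matched by accident.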

9 Comments

That causes the same compile error. He needs to escape the [ and ] instead. If you put just the ] at the start then it does work as you describe.
No, empty lists [] are invalid that's why it works.
Pattern.compile("[][a-z]") causes "Unclosed character class". Pattern.compile("[]a-z]") works.
As always, regex implementations differ. Java seems to require us to escape ] within the regex; I believe it shouldn't require this, but...
@Luizgrs: Java supports character class union and intersection, so both [ and ] have special meaning inside a character class.
0

You can improve your regex like this:

"[\\w !\"#$%&'()*+,-./:;<=>?@\\[\\]^_`{|}~]"

Here you can take advantage of the predefined character class \w.
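\w is equivalent to [a-zA-Z_0-9]. It also covers the underscore, which this character class already includes, so the set of matched characters stays the same.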
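Per the Java Pattern documentation, \w is [a-zA-Z_0-9] by default, so it includes the underscore. The sketch below compares the \w version against the explicit ranges over the ASCII range; the class name WordClassDemo and the method agreeOnAscii are hypothetical names for this example.

```java
class WordClassDemo {
    // Version using the predefined class \w.
    static final String WITH_W = "[\\w !\"#$%&'()*+,-./:;<=>?@\\[\\]^_`{|}~]";
    // Version spelling out the ranges explicitly.
    static final String EXPLICIT = "[a-zA-Z0-9 !\"#$%&'()*+,-./:;<=>?@\\[\\]^_`{|}~]";

    // Hypothetical check: true if both patterns give the same answer
    // for every single ASCII character (code points 0..127).
    static boolean agreeOnAscii() {
        for (char c = 0; c < 128; c++) {
            String s = String.valueOf(c);
            if (s.matches(WITH_W) != s.matches(EXPLICIT)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // On its own, \w accepts '_' while [a-zA-Z0-9] does not.
        System.out.println("_".matches("\\w"));
        System.out.println("_".matches("[a-zA-Z0-9]"));
        // Both full patterns list '_' anyway, so they agree on ASCII.
        System.out.println(agreeOnAscii());
    }
}
```

This only holds with the default flags; with Pattern.UNICODE_CHARACTER_CLASS, \w matches a larger set of characters.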

Reference: Java Regular Expressions: Predefined Character Classes

1 Comment

Pardon my presumption in correcting your code, but you were using the escapes inconsistently. I find it's best to display the regex as a Java string literal, just as it would appear in the source code. It makes it easier to explain that it's the Java compiler that requires the double escapes, not the regex compiler.
