39

I am aware that regular expressions are common across languages, but I am having trouble writing the Java syntax. I have a regular expression coded in JS as:

if((/[a-zA-Z]/).test(str) && (/[0-9]|[\x21-\x2F|\x3A-\x40|\x5B-\x60|\x7B-\x7E]/).test(str))         
return true;

How do I write the same in Java?

I have imported

import java.util.regex.Matcher;
import java.util.regex.Pattern;

Just to add: with what I have tried so far, Java reports that \x is an invalid escape character.

7 Answers

52

Change the leading and trailing '/' characters to '"', and then replace each '\' with "\\".

Unlike JavaScript, Perl and other scripting languages, Java doesn't have a special syntax for regexes. Instead, they are (typically) expressed using Java string literals. But '\' is the escape character in a Java string literal, so each '\' in the original regex has to be escaped with a second '\'. (And if you have a literal backslash character in the regex, you end up with "\\\\" in the Java string literal!)

This is a bit confusing and daunting for Java novices, but it is totally logical. Just remember that you are using a Java string literal to express the regex.
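As a rough illustration of the doubling rule, here is a minimal sketch. The class name EscapeDemo, its helper methods and the sample patterns are made up for this example and are not taken from the question.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names and sample patterns are illustrative only.
final class EscapeDemo {
    // JS: /\d+\.\d+/  -> every backslash is doubled inside the Java string literal.
    static final Pattern DECIMAL = Pattern.compile("\\d+\\.\\d+");

    // JS: /\\/ (matches one literal backslash) -> four backslashes in the Java literal.
    static final Pattern BACKSLASH = Pattern.compile("\\\\");

    static boolean hasDecimal(String s) {
        return DECIMAL.matcher(s).find();
    }

    static boolean hasBackslash(String s) {
        return BACKSLASH.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(hasDecimal("pi is 3.14"));
        // "C:\\temp" is the seven-character string C:\temp
        System.out.println(hasBackslash("C:\\temp"));
    }
}
```

The string literal is decoded once by the Java compiler and the result is then parsed by the regex engine, which is why a single literal backslash costs four characters in source.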


However, as @antak notes, there are various differences between the regex languages implemented by Java and JavaScript. So if you take an arbitrary JavaScript regex and transliterate it to Java (as above), it might not work.
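One such difference, reported in the comments on this answer, is how an unescaped ']' directly after '[^' is read. The sketch below shows the Java side only; BracketDemo and its method names are hypothetical. In JavaScript, /[^]]/ is parsed as "any character" followed by a literal ']', so the same replacement there yields a()cd.

```java
// Hypothetical demo class; only String.replaceAll from the JDK is used.
final class BracketDemo {
    // Java reads "[^]]" as a single class: any character except ']'.
    static String unescaped(String s) {
        return s.replaceAll("[^]]", "()");
    }

    // Escaping the bracket (regex [^\]]) states the same intent explicitly,
    // and JavaScript reads /[^\]]/ the same way.
    static String escaped(String s) {
        return s.replaceAll("[^\\]]", "()");
    }

    public static void main(String[] args) {
        System.out.println(unescaped("ab]cd"));
        System.out.println(escaped("ab]cd"));
    }
}
```

Escaping brackets inside character classes is a cheap way to keep a pattern unambiguous in both flavors.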

Here are some references that summarize the differences.


4 Comments

Thx a lot..I did not quite get the 2nd part...replace each '\' with '\' .....Are they both not the same ?
@testndtv - I said "has to be escaped with a 2nd backslash". I didn't say it has to be replaced.
If only it could be this easy... Regex in Java and JS have subtle differences that'll bite those unaware: e.g. JS: 'ab]cd'.replace(/[^]]/g, '()') -> a()cd, Java: "ab]cd".replaceAll("[^]]", "()") -> ()()]()()
I second what @antak said. I get errors due to ^ and [], and nothing to do with the slashes. If you have this problem, see the accepted answer to stackoverflow.com/questions/37978665/…
37

You can use online regex evaluators like https://regex101.com for conversion.

  1. Go to https://regex101.com
  2. Choose ECMAScript (JavaScript) FLAVOR
  3. Insert your regex
  4. Open TOOLS -> Code Generator (LANGUAGE - Java)
  5. Copy-paste

Even though it isn't the hardcore programmer's way, it is significantly less error-prone, especially if you need to convert only one or two expressions.

1 Comment

I had stared at some long JS regex and re-programmed that NPM module into a Java package... I carefully escaped the escaping escapes to escape my headache... At the end of this escapade, this answer helped me escape my doubts that my escape sequences had escaped some of the encoding escape logic!
4

If you really need JavaScript regex semantics in Java, one approach is to use the embedded JavaScript engine to evaluate the regex. For example:

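// Look up a JavaScript engine; getEngineByName returns null if none is available
// (the bundled Nashorn engine was removed in JDK 15).
javax.script.ScriptEngineManager se = new javax.script.ScriptEngineManager();
javax.script.ScriptEngine engine = se.getEngineByName("js");

String regExp = "/^\\d+$/";          // JavaScript regex literal, written as a Java string
engine.put("str", "1234");           // expose the input string to the script
engine.eval("var rgx=" + regExp);    // eval throws javax.script.ScriptException on error
Object value = engine.eval(
    "function validate(r, s){ return (r).test(s);};validate(rgx, str);");
System.out.println(value);           // result of rgx.test(str)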


3

The only thing you have to do is double the backslashes.

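// Same patterns as the JavaScript version, with each backslash doubled.
Pattern p1 = Pattern.compile("[a-zA-Z]");
// Note: '|' inside [...] is a literal pipe, not alternation; it is kept to mirror the original.
Pattern p2 = Pattern.compile("[0-9]|[\\x21-\\x2F|\\x3A-\\x40|\\x5B-\\x60|\\x7B-\\x7E]");

// find() looks for a match anywhere in str, like JavaScript's test().
if (p1.matcher(str).find() && p2.matcher(str).find()) {
    return true;
}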

3 Comments

@testndtv: Don't ever say you are getting an error without showing the error message.
It was missing the second find() call; try it now.
Thx again...Actually I replaced with .matches() which I hope should also be fine...Pls confirm..
1

Converting a JavaScript regex pattern to a Java 8 regex pattern.

I think the answers above leave out a few points that matter when converting a complex JavaScript regex pattern. Take, for example, the regex below, which validates email addresses.

^(([^<>#&%/?~()\[\]\.,;:|\s@"]+(\.[^<>#&%/?~()\[\]\.,;:|\s@"]+)*)|(".+"))@(([^<>#&%/?~()\[\]\.,;:|\s@"]+\.)+[^<>#&%/?~()\[\]\.,;:|\s@"]+)$

Go to https://regex101.com

Insert your regex.

Then select the Java 8 flavor.

If regex101 reports errors in the right-hand panel, fix them and then copy the resulting pattern; it will work the same as in JS.

After I fixed the reported errors, the pattern was accepted. The test below prints true if the user has given an invalid mail address.

The following code snippet lets you test string cases:

import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) {
        String testEmailAddress = "[email protected]";
        Pattern _PATTERN = Pattern.compile("^(([^<>#&%/?~()\\[\\]\\.,;:|\\s@\\\"]+(\\.[^<>#&%/?~()\\[\\]\\.,;:|\\s@\\\"]+)*)|(\\\".+\\\"))@(([^<>#&%/?~()\\[\\]\\.,;:|\\s@\\\"]+\\.)+[^<>#&%/?~()\\[\\]\\.,;:|\\s@\\\"]+)$");

        if (!_PATTERN.matcher(testEmailAddress).matches()) {
            System.out.println(true);
        } else {
            System.out.println(false);
        }
    }
}


0

Java regular expressions are first and foremost strings, so you must delimit them with double quotes rather than /. Also, in Java, you need to escape each \ by doubling it, like so: \\.

Take a look at this tutorial from Oracle for more information.

2 Comments

It's not the regex engine itself, it's the matches() method that requires the regex to consume the whole string, as if it were anchored at both ends. The find() method performs the more traditional kind of match, but you can't call it from the String object like you can with matches(); you have to explicitly create a Matcher object like AlexR did.
@AlanMoore: Thanks for letting me know. I have rarely used find, I have almost always used matches(). I have removed that section from my answer, thanks again for the clarification.
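To make the find() versus matches() distinction from the comments above concrete, here is a small sketch. The class FindVsMatches and its method names are invented for illustration; only Pattern and Matcher come from the JDK.

```java
import java.util.regex.Pattern;

// Hypothetical demo class contrasting Matcher.find() with Matcher.matches().
final class FindVsMatches {
    static final Pattern LETTER = Pattern.compile("[a-zA-Z]");

    // find(): true if the pattern matches anywhere in the input, like JavaScript's test().
    static boolean containsLetter(String s) {
        return LETTER.matcher(s).find();
    }

    // matches(): true only if the pattern consumes the entire input.
    static boolean isExactlyOneLetter(String s) {
        return LETTER.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(containsLetter("abc123"));
        System.out.println(isExactlyOneLetter("abc123"));
        System.out.println(isExactlyOneLetter("a"));
    }
}
```

With the input abc123, containsLetter is true but isExactlyOneLetter is false, so swapping find() for matches() in the question's check would change its result unless the pattern is rewritten to cover the whole string.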
0

If you want to use the same regex in JavaScript as well as Java, try to obtain the regex string at runtime rather than defining it as a literal in the source code. A string literal is checked by the Java compiler, which gives you the invalid escape character error; a string read at runtime is not subject to that string-literal check and is passed directly to the pattern compiler.

If you can get the regex from an API or read it from a locally stored text file, that works well.
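A minimal sketch of this idea, assuming the regex lives in a plain UTF-8 text file. The class RuntimeRegexDemo, its methods and the temp file standing in for a shared file are all hypothetical. Note that Pattern.compile still validates the regex itself and throws PatternSyntaxException if it is malformed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Pattern;

// Hypothetical demo class: loads a regex from a text file instead of a string literal.
final class RuntimeRegexDemo {
    // Reads the regex source from the file and compiles it.
    // trim() drops surrounding whitespace such as a trailing newline.
    static Pattern loadPattern(Path file) {
        try {
            String source = new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
            return Pattern.compile(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes regexSource to a temp file (standing in for a shared file), then loads and applies it.
    static boolean findViaFile(String regexSource, String input) {
        try {
            Path file = Files.createTempFile("shared-regex", ".txt");
            try {
                Files.write(file, regexSource.getBytes(StandardCharsets.UTF_8));
                return loadPattern(file).matcher(input).find();
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The file content has single backslashes: [0-9]|[\x21-\x2F]
        // They are doubled below only because this demo embeds the text in Java source.
        String fileContent = "[0-9]|[\\x21-\\x2F]";
        System.out.println(findViaFile(fileContent, "abc!"));
        System.out.println(findViaFile(fileContent, "abc"));
    }
}
```

Avoid java.util.Properties for such a file unless you double the backslashes again, since the properties format treats the backslash as an escape character too.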

