I have a Java regex:

^[a-zA-Z_][a-zA-Z0-9_]{1,126}$

It means:

  • Begin with an alphabetic character or underscore character.
  • Subsequent characters may include letters, digits or underscores.
  • Be between 2 and 127 characters in length (the {1,126} quantifier requires at least one character after the first).
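
The rules above can be checked with a short sketch. The class name LabelPatternCheck and the helper isValid are hypothetical names used only for illustration; the pattern itself is the one quoted above. Note that the {1,126} quantifier requires at least one character after the first, so a single-character string is rejected.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class LabelPatternCheck {
    // The validation pattern from the question, compiled once.
    static final Pattern VALID = Pattern.compile("^[a-zA-Z_][a-zA-Z0-9_]{1,126}$");

    // Returns true when the whole string satisfies the pattern.
    static boolean isValid(String s) {
        return VALID.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("_3_fgh99__")); // true
        System.out.println(isValid("23_fgh99@#")); // false: starts with a digit
        System.out.println(isValid("a"));          // false: {1,126} needs a second character
        System.out.println(isValid("ab"));         // true
    }
}
```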

Now I want to replace every character in a string that this regex does not allow with an underscore.

Example:

final String label = "23_fgh99@#";
System.out.println(label.replaceAll("^[^a-zA-Z_][^a-zA-Z0-9_]{1,126}$", "_"));

But the result is still 23_fgh99@#.

How can I "convert" it to _3_fgh99__?

3 Comments
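  • The pattern is anchored with ^ and $, so it must match the whole string. After the leading 2, the next character 3 is not matched by [^a-zA-Z0-9_], so there is no match. Commented May 21, 2015 at 10:17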
  • Try label.replaceAll("^[^a-zA-Z_]|(?<!^)[^a-zA-Z0-9_]", "_"). It outputs _3_fgh99__, though. Commented May 21, 2015 at 10:17
  • Oops, it seems I have an answer after an edit :). Commented May 21, 2015 at 10:19

1 Answer

Use this code:

final String label = "23_fgh99@#";
System.out.println(label.replaceAll("^[^a-zA-Z_]|(?<!^)[^a-zA-Z0-9_]", "_"));

It outputs _3_fgh99__.

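To replace everything that is "not in the original pattern", negate each character class and keep the two positions separate. The first alternative, ^[^a-zA-Z_], matches a disallowed character only at the beginning of the string. The second, (?<!^)[^a-zA-Z0-9_], matches a disallowed character anywhere else: the negative lookbehind (?<!^) asserts that the current position is not the start. The alternation operator | combines both patterns into a single replacement operation. Note that this only replaces characters; it does not enforce the length limit.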
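
As a self-contained sketch of the replacement described above, the call can be wrapped in a small method. The class name LabelSanitizer and the method sanitize are hypothetical names chosen for this example; the regex is the one from the answer. Length limits are deliberately not handled here.

```java
// Hypothetical wrapper class, for illustration only.
final class LabelSanitizer {
    // Replaces a disallowed first character, and any disallowed character
    // elsewhere in the string, with an underscore.
    static String sanitize(String label) {
        return label.replaceAll("^[^a-zA-Z_]|(?<!^)[^a-zA-Z0-9_]", "_");
    }

    public static void main(String[] args) {
        System.out.println(sanitize("23_fgh99@#")); // _3_fgh99__
        System.out.println(sanitize("abc"));        // abc (already valid, unchanged)
        System.out.println(sanitize("a b"));        // a_b
    }
}
```

A leading digit is replaced because digits are allowed only after the first character, while the 3 in second position is kept.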

3 Comments

Please try to explain your answer for future users.
@MarounMaroun: Sure, I explain my answers in most cases, and this case seems quite peculiar.
@java_guy: Please consider accepting the answer if it worked for you.
