
I need to build a regex pattern to validate a String in Java, so I wrote the pattern [A-Z][a-z]*\s?[A-Z]?[a-z]*$ for these conditions:

  • Must start with a capital letter
  • Every subsequent word must start with a capital letter
  • No digits allowed
  • No two consecutive spaces allowed

Pattern.matches("[A-Z][a-z]*\s?[A-Z]?[a-z]*$","Joe V") returns false for me in Java, but the same pattern returns true for the input "Joe V" on regexr.com.

What might be the cause?

  • Are you sure about s? It seems you expect it to match a space, but that would need to be \s?, and inside a string literal the backslash has to be escaped. Commented Nov 17, 2022 at 9:17
  • Yeah, you're right. That was a typo. Commented Nov 17, 2022 at 9:27
  • But you still haven't escaped that backslash, which is required because of the Java string literal it is in. Commented Nov 17, 2022 at 9:28

2 Answers


JavaScript has regex literals, whereas Java does not: in Java a regex is written as an ordinary string. Because Java string literals use \ to introduce escape sequences (such as \n), a backslash that should reach the regex engine must itself be escaped with another \. So every \ in the regex has to be written as \\ in Java source.

So your code should be:

Pattern.matches("[A-Z][a-z]*\\s?[A-Z]?[a-z]*$", "Joe V")

which returns true
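As a minimal sketch of the escaping rule above (the class name EscapedRegexDemo and the helper isMatch are made up for illustration), the following shows that the string literal "\\s" reaches the regex engine as \s, and how the question's pattern then behaves on a few inputs:

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question or answer.
class EscapedRegexDemo {
    // In Java source, "\\s" is a two-character string: a backslash followed by 's'.
    // The regex engine receives \s and treats it as the whitespace class.
    static final String REGEX = "[A-Z][a-z]*\\s?[A-Z]?[a-z]*$";

    // Pattern.matches requires the whole input to match the pattern.
    static boolean isMatch(String input) {
        return Pattern.matches(REGEX, input);
    }

    public static void main(String[] args) {
        System.out.println(REGEX);              // the pattern text as the regex engine sees it
        System.out.println(isMatch("Joe V"));   // true
        System.out.println(isMatch("joe v"));   // false: does not start with an uppercase letter
        System.out.println(isMatch("Joe  V"));  // false: \s? allows at most one whitespace character
    }
}
```

Printing REGEX is a quick way to check what the regex engine actually receives after the compiler has processed the string literal.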

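P.S. Since Java 15, a single \s inside a Java string literal is itself an escape sequence for a plain space character; in earlier versions it is a compile-time error.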


1 Comment

\s being interpreted as a space is a recent addition (Java 15, when text blocks were released).

You can use

Pattern.matches("[A-Z][a-z]*(?:\\s[A-Z][a-z]*)*","Joe V")
Pattern.matches("\\p{Lu}\\p{Ll}*(?:\\s\\p{Lu}\\p{Ll}*)*","Joe V")

See the regex demo #1 and regex demo #2.

Note that .matches requires a full string match, hence the use of ^ and $ anchors on the testing site and their absence in the code.
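To illustrate the full-match behaviour, here is a small sketch (the class and method names are hypothetical) that runs the same unanchored pattern through Matcher.matches(), which must consume the entire input, and Matcher.find(), which accepts any matching substring:

```java
import java.util.regex.Pattern;

// Hypothetical demo class contrasting full matching with substring search.
class MatchesVsFindDemo {
    // No ^ or $ here: whether the whole input must match depends on the method used.
    static final Pattern UNANCHORED =
            Pattern.compile("[A-Z][a-z]*(?:\\s[A-Z][a-z]*)*");

    // matches() succeeds only if the entire input is consumed by the pattern.
    static boolean fullMatch(String input) {
        return UNANCHORED.matcher(input).matches();
    }

    // find() succeeds if any substring of the input matches the pattern.
    static boolean partialMatch(String input) {
        return UNANCHORED.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("Joe 5"));    // false: " 5" is left over
        System.out.println(partialMatch("Joe 5")); // true: the substring "Joe" matches
    }
}
```

This is why a testing site that searches for matches within the text needs explicit ^ and $ anchors to reproduce what .matches does in Java.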

Details:

  • ^ - start of string (implied in .matches)
  • [A-Z] / \p{Lu} - an (Unicode) uppercase letter
  • [a-z]* / \p{Ll}* - zero or more (Unicode) lowercase letters
  • (?:\s[A-Z][a-z]*)* / (?:\s\p{Lu}\p{Ll}*)* - zero or more sequences of
    • \s - one whitespace
    • [A-Z][a-z]* / \p{Lu}\p{Ll}* - an uppercase (Unicode) letter and then zero or more (Unicode) lowercase letters.
  • $ - end of string (implied in .matches)
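The breakdown above can be checked against the conditions from the question with a short sketch like the following. The class name CapitalizedWordsValidator and its method names are assumptions made for this example; the two patterns are the ones from this answer.

```java
import java.util.regex.Pattern;

// Hypothetical validator wrapping the two patterns from the answer.
class CapitalizedWordsValidator {
    // ASCII-only variant: each word is one uppercase letter plus optional lowercase letters.
    static final Pattern ASCII =
            Pattern.compile("[A-Z][a-z]*(?:\\s[A-Z][a-z]*)*");

    // Unicode variant: \p{Lu} is any uppercase letter, \p{Ll} any lowercase letter.
    static final Pattern UNICODE =
            Pattern.compile("\\p{Lu}\\p{Ll}*(?:\\s\\p{Lu}\\p{Ll}*)*");

    static boolean isValidAscii(String input) {
        return ASCII.matcher(input).matches();
    }

    static boolean isValidUnicode(String input) {
        return UNICODE.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidAscii("Joe V"));    // true
        System.out.println(isValidAscii("Joe v"));    // false: second word is not capitalized
        System.out.println(isValidAscii("Joe5"));     // false: digits are not allowed
        System.out.println(isValidAscii("Joe  V"));   // false: two consecutive spaces
        // \u00C9 is an uppercase E with acute accent, outside the A-Z range.
        System.out.println(isValidAscii("\u00C9lodie Durand"));   // false
        System.out.println(isValidUnicode("\u00C9lodie Durand")); // true
    }
}
```

Because each whitespace character must be followed immediately by an uppercase letter, both patterns also reject trailing whitespace and, since \s matches tabs and newlines too, they accept those as word separators.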
