
I'm having trouble getting my pattern to validate a string entry correctly. The PHP portion of this assignment works, so I've left it out to keep this easier to read. Can someone tell me why this pattern isn't matching what I expect?

This pattern has these validation requirements:

  1. Should first have 3-6 lowercase letters
  2. This is immediately followed by either a hyphen or a space
  3. Followed by 1-3 digits

    $codecheck = '/^([[:lower:]]{3,6}-)|([[:lower:]]{3,6} ?)\d{1,3}$/';
    

Currently this catches most of the requirements, but it only seems to enforce the minimum lengths: it doesn't return false when more than 6 letters or more than 3 digits are entered.

Thanks in advance for any assistance!

  • Instead of the "or" pipe, use [\s-] to look for a - or a space. Commented Mar 22, 2016 at 19:45
  • You don't need to duplicate the lower: /^[[:lower:]]{3,6}[\s-]\d{1,3}$/ Commented Mar 22, 2016 at 19:55
  • I don't think design-patterns tag is appropriate here. You should maybe use regex instead. Commented Mar 22, 2016 at 19:55
  • @AbraCadaver, please write it as an answer and not a comment. Commented Mar 22, 2016 at 20:11

2 Answers


The problem lies in how the alternatives are grouped. The | splits the whole pattern in two, so right now the regex matches a string that either

  • ^([[:lower:]]{3,6}-) - starts with 3-6 lowercase letters followed by a hyphen
  • | - or
  • ([[:lower:]]{3,6} ?)\d{1,3}$ - ends with 3-6 lowercase letters followed by an optional space and then 1-3 digits.
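
To see the effect of that grouping, here is a minimal sketch that runs the question's pattern against a few inputs. The sample strings are made up for illustration and are not from the original post.

```php
<?php
// Original pattern from the question, unchanged.
$original = '/^([[:lower:]]{3,6}-)|([[:lower:]]{3,6} ?)\d{1,3}$/';

// Hypothetical sample inputs, chosen only to illustrate the grouping.
$samples = ['abc-123', 'abc-1234', 'abc-', 'abcdefg 12', 'ab-1'];

$results = [];
foreach ($samples as $s) {
    // preg_match returns 1 on a match, 0 on no match.
    $results[$s] = preg_match($original, $s) === 1;
    printf("%-12s %s\n", $s, $results[$s] ? 'match' : 'no match');
}
```

Only 'ab-1' is rejected. 'abc-1234' and 'abc-' pass through the left alternative, which never checks the digits or the end of the string, and 'abcdefg 12' passes through the right alternative, which is not anchored at the start.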

In fact, you can get rid of the alternation altogether:

$codecheck = '/^\p{Ll}{3,6}[- ]\d{1,3}$/';

See the regex demo

Explanation:

  • ^ - start of string
  • \p{Ll}{3,6} - 3-6 lowercase letters
  • [- ] - a positive character class matching one character, either a hyphen or a space
  • \d{1,3} - 1-3 digits
  • $ - end of string
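
As a rough check, the pattern can be wrapped in a small helper and run against a few inputs. The function name isValidCode and the sample strings are assumptions made for this sketch, not part of the original answer.

```php
<?php
// Pattern from the answer above, unchanged.
const CODE_PATTERN = '/^\p{Ll}{3,6}[- ]\d{1,3}$/';

// Hypothetical helper name, used here only for illustration.
function isValidCode(string $code): bool
{
    // preg_match returns 1 on a match, 0 on no match.
    return preg_match(CODE_PATTERN, $code) === 1;
}

// Hypothetical sample inputs covering each requirement.
foreach (['abc-1', 'abcdef 123', 'abcdefg 12', 'abc-1234', 'abc12', 'Abc-12'] as $s) {
    printf("%-12s %s\n", $s, isValidCode($s) ? 'valid' : 'invalid');
}
```

Only the first two inputs are reported as valid. One caveat: in PCRE, $ also matches just before a trailing newline, so "abc-1\n" would pass too. If that matters, the D modifier or \z in place of $ should rule it out.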

3 Comments

@Shaw: Just to clarify that your regex can work, too, once the alternations are placed into a group: ^(?:([[:lower:]]{3,6}-)|([[:lower:]]{3,6} ?))\d{1,3}$. However, it is not efficient.
Thanks for pointing that out - as I'm just learning it, it's good to know I was on the right track for educational purposes, but not yet seeing the whole picture in regards to efficiency.
When you use alternation like this, backtracking increases the number of steps needed to return a match or to detect a non-matching string. See the green box showing the number of steps with your regex and mine. Although the step count is not a direct indicator of performance, when it doubles as it does here, it hints at a performance difference between the two patterns.

You need to delimit the scope of the | operator in the middle of your regex.

As it is now:

  • the right-side argument of that OR runs up until the very end of your regex, even including the $. So neither the digits nor the end-of-string condition applies to the left side of the |.

  • the left-side argument of the OR starts with ^, so the start-of-string anchor applies only to the left side.

That is why you get a match when you supply 7 lowercase characters. The first character is ignored, and the rest matches the right side of the regex pattern.
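
A small sketch can make the skipped first character visible by asking preg_match for the offset of the match. The input string is a made-up example.

```php
<?php
// Original pattern from the question, unchanged.
$original = '/^([[:lower:]]{3,6}-)|([[:lower:]]{3,6} ?)\d{1,3}$/';

// Hypothetical input: seven lowercase letters, a space, two digits.
$input = 'abcdefg 12';

$text = null;
$offset = null;
// PREG_OFFSET_CAPTURE makes each entry of $m a [text, offset] pair.
if (preg_match($original, $input, $m, PREG_OFFSET_CAPTURE) === 1) {
    [$text, $offset] = $m[0];
    printf("matched \"%s\" starting at offset %d\n", $text, $offset);
}
```

This prints matched "bcdefg 12" starting at offset 1: the right-side alternative has no ^, so the engine is free to start the match one character in.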

1 Comment

Thanks for breaking it down!
