
I am trying to validate a string against the following regular expression which has been imposed upon me:

[-,.:; 0-9A-Z&@$£¥€'"«»‘’“”?!/\\()\[\]{}<>]{3}[-,.:; 0-9A-Z&@$£¥€'"«»‘’“”?!/\\()\[\]{}<>*=#%+]{0,157}

Can anybody help with writing a preg_match in PHP to validate an input string against this? I am struggling because:

  1. my knowledge of regex isn't that great in the first place
  2. I see special characters in the regex itself which I feel sure PHP won't be happy about me inserting directly into a string (e.g. $£¥€)

In vain hope I just tried sticking it into preg_match, escaping the double quotes, thus:

$ste = "Some string input";

if(preg_match("/[-,.:; 0-9A-Z&@$£¥€'\"«»‘’“”?!/\\()\[\]{}<>]{3}[-,.:; 0-9A-Z&@$£¥€'\"«»‘’“”?!/\\()\[\]{}<>*=#%+]{0,157}/",$ste))
{
    echo "OK";  
}
else
{
    echo "Not OK";  
}

Thanks in advance!!

3 Comments
  • You already have your regex. So what's this all about? Commented Oct 17, 2013 at 10:15
  • So, you want a string that can match a regular expression whose function you are not aware of? Why? Commented Oct 17, 2013 at 10:16
  • You should enable error reporting; you would have noticed that you need to escape the / in your regex or replace the delimiters with something like ~. Commented Oct 17, 2013 at 10:17

2 Answers

1

You can do it like this:

if (preg_match('~^[ -"$&-),-<>?-\]{}£¥€«»‘’“”]{3}[ -\]{}£¥€«»‘’“”]{0,157}$~u', $ste))
    echo 'OK';
else
    echo 'Not OK';

I have added the "u" modifier for Unicode, and reduced the size of the character classes by using ranges (for example, ,-< means all characters between , and < in the Unicode table).
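As a quick illustration of how such a range expands, the following sketch lists every character between , and < by code point. The $chars variable name is only illustrative.

```php
<?php
// Build the characters covered by the range ,-< (code points 0x2C through 0x3C).
$chars = implode('', array_map('chr', range(ord(','), ord('<'))));

echo $chars, PHP_EOL; // prints ,-./0123456789:;<
```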

Most importantly, I have added the anchors ^ and $, which mean the start and the end of the string respectively. Without them, preg_match succeeds as soon as any part of the input matches the pattern.
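To see what the anchors change, here is a small sketch that runs the same pattern with and without them. The variable names $anchored and $unanchored and the sample input are only illustrative, and the file is assumed to be saved as UTF-8.

```php
<?php
// Single-quoted strings, so PHP does not try to interpolate "$" in the pattern.
$anchored   = '~^[ -"$&-),-<>?-\]{}£¥€«»‘’“”]{3}[ -\]{}£¥€«»‘’“”]{0,157}$~u';
$unanchored = '~[ -"$&-),-<>?-\]{}£¥€«»‘’“”]{3}[ -\]{}£¥€«»‘’“”]{0,157}~u';

// "+" is only allowed from the fourth character onwards.
$input = '+HELLO STEVE';

// Without anchors the pattern matches the substring "HELLO STEVE".
var_dump(preg_match($unanchored, $input)); // int(1)

// With anchors the whole string must match, so the leading "+" is rejected.
var_dump(preg_match($anchored, $input));   // int(0)
```

Note that $ also matches just before a trailing newline; if that matters, \z or the D modifier is stricter.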


0

PHP will be perfectly happy with the "special" characters in the expression, provided you do the following:

  1. Make sure the input string is encoded with UTF-8 encoding.

  2. Make sure your PHP program file is saved using UTF-8 encoding. (Obviously you'll need to use UTF-8 encoding in all other parts of your system too, or you'll get borked characters showing up somewhere along the line, but that's outside the scope of this question.)

  3. Add the u modifier to the end of the regex pattern string to tell the regex parser to handle UTF-8 characters, i.e.:

    preg_match("/....../u", ...);
                        ^
                     add this
    

Other than that, you've got it pretty much spot on already.
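Putting the three steps together, here is a minimal sketch, assuming the file is saved as UTF-8. It keeps the imposed pattern verbatim in a nowdoc, because a double-quoted string would collapse \\ into a single backslash and may treat $£¥€ as a variable name (bytes 0x80-0xff are legal in PHP identifiers). The ~ delimiter, the \A and \z anchors, and the helper name isValid are my additions, not part of the imposed pattern.

```php
<?php
// The imposed pattern, pasted unchanged. A nowdoc (<<<'RE') performs no
// escape processing and no variable interpolation.
$imposed = <<<'RE'
[-,.:; 0-9A-Z&@$£¥€'"«»‘’“”?!/\\()\[\]{}<>]{3}[-,.:; 0-9A-Z&@$£¥€'"«»‘’“”?!/\\()\[\]{}<>*=#%+]{0,157}
RE;

// "~" does not occur in the pattern, so it is safe as a delimiter.
// \A and \z force the whole string to match (an assumption about the intent).
$regex = '~\A(?:' . $imposed . ')\z~u';

// Hypothetical helper. preg_match returns false on failure, for example when
// the subject is not valid UTF-8 under the u modifier, so compare against 1.
function isValid(string $regex, string $input): bool
{
    return preg_match($regex, $input) === 1;
}

var_dump(isValid($regex, 'HELLO STEVE'));  // every character is allowed
var_dump(isValid($regex, '+HELLO STEVE')); // "+" is not allowed in the first three
```

If whole-string matching is not what the pattern's owner intended, drop the \A(?: and )\z parts and keep the rest.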

13 Comments

I'll ruin the mood: somewhere in the expression there is something like “”?!/\\(). Notice the / in the expression; it's also used as the delimiter, which means it will throw an "unknown modifier ..." error.
Thank you for the non-critical answer. Appreciate your time. It's all UTF8 files and encoding in the system. I tried adding the u at the end but it's still not correctly identifying the pattern :(
@DanielProcter: HamZa does have a point about the rogue slash; I missed that (I assumed the pattern was already fully escaped, since it has been given to you as a fait accompli, but obviously not). The simplest solution is to use a different delimiter, one which isn't in the string. Maybe use ~ instead of / at the start and end of the expression. (Or you could escape the / characters in the pattern, but if the pattern has been imposed on you, it's probably best not to change it if you can help it.)
Thank you for this. I really appreciate it. I've changed the delimiters to ~ but it's still not behaving as expected: $ste = utf8_encode("+HELLO STEVE"); if(preg_match("~[-,.:; 0-9A-Z&@$£¥€'\"«»‘’“”?!/\\()\[\]{}<>]{3}[-,.:; 0-9A-Z&@$£¥€'\"«»‘’“”?!/\\()\[\]{}<>*=#%+]{0,157}~u",$ste)) { echo "OK"; } else { echo "Not OK"; }
NB: the utf8_encode() call may or may not be necessary, depending on how the string is already encoded.