3

I am trying to use a Python regular expression to validate the value of a variable.

The validation rules are as follows:

  • The value can contain only a-z, A-Z, 0-9 and * (no blanks, no -, no ,)
  • The value can start with a digit (0-9), a letter (a-z, A-Z) or *
  • The value can end with a digit (0-9), a letter (a-z, A-Z) or *
  • In the middle, the value can contain a digit (0-9), a letter (a-z, A-Z) or *
  • Any other value must be rejected

Currently I am using the following snippet of code to do the validation:

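import re
data = "asdsaq2323-asds"
# Note: match() returns None when nothing matches at the start of the string,
# in which case .group() raises AttributeError.
if re.compile("[a-zA-Z0-9*]+").match(data).group() == data:
    print("match")
else:
    print("no match")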

I feel there should be a better way of doing the above. I am looking for something like the following:

validate_func(pattern, data)
# returns data if the data passes the validation rules
# returns None if the data does not pass the validation rules
# should not return only the part of the data that matches the validation rules

Does such a built-in function exist?

1
  • Does [\w\d\*]+ not suffice? The best way to test is to loop over all the values you already have (in a list or file?). When a value in the loop fails to match, print it out, investigate what goes wrong, and see how you can refine your regex. Commented Mar 22, 2013 at 22:38
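
The loop suggested in the comment above could look roughly like the sketch below. The sample values and the helper name `failures` are made up for illustration. Note that `\w` also matches the underscore (and already covers digits), so `[\w\d\*]+` is more permissive than the rules in the question.

```python
import re

# Hypothetical sample values; replace with your own list or file contents.
samples = ["abc123", "*abc*", "asdsaq2323-asds", "a b", "a_b", ""]

strict = re.compile(r"\A[a-zA-Z0-9*]+\Z")  # character class from the question
loose = re.compile(r"\A[\w\d\*]+\Z")       # character class from the comment


def failures(regex, values):
    """Return the values that the regex rejects."""
    return [v for v in values if not regex.match(v)]


strict_rejected = failures(strict, samples)
loose_rejected = failures(loose, samples)

print(strict_rejected)  # ['asdsaq2323-asds', 'a b', 'a_b', '']
print(loose_rejected)   # ['asdsaq2323-asds', 'a b', ''] ("a_b" slips through)
```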

3 Answers

6

In a regex, the metacharacters ^ and $ mean "start-of-string" and "end-of-string" (respectively); so, rather than seeing what matches, and comparing it to the whole string, you can simply require that the regex match the whole string to begin with:

import re
data = "asdsaq2323-asds"
if re.compile("^[a-zA-Z0-9*]+$").match(data):
    print("match")
else:
    print("no match")

In addition, since you're only using the regex once (you compile it and immediately use it), you can use the convenience function re.match to handle that in a single step:

import re
data = "asdsaq2323-asds"
if re.match("^[a-zA-Z0-9*]+$", data):
    print("match")
else:
    print("no match")

3

To make sure the entire string matches your pattern, use beginning and end of string anchors in your regex. For example:

import re
data = "asdsaq2323-asds"
regex = re.compile(r'\A[a-zA-Z0-9*]+\Z')
if regex.match(data):
    print("match")
else:
    print("no match")

Making this a function:

def validate_func(regex, data):
    return data if regex.match(data) else None

Example:

>>> regex = re.compile(r'\A[a-zA-Z0-9*]+\Z')
>>> validate_func(regex, 'asdsaq2323-asds')
>>> validate_func(regex, 'asdsaq2323asds')
'asdsaq2323asds'

As a side note, I prefer \A and \Z over ^ and $ for validation like this, because the meaning of ^ and $ can change depending on the flags used, and $ will also match just before a line break character at the end of the string.
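
A small sketch of that difference, using made-up input strings: with `$`, a value carrying a trailing newline still passes, and with `re.MULTILINE` only the first line has to be valid, while `\A` and `\Z` reject both.

```python
import re

with_newline = "abc123\n"   # valid characters followed by a trailing newline
two_lines = "abc\n-def"     # second line contains a forbidden "-"

# $ matches just before a newline at the very end of the string.
dollar = re.match(r"^[a-zA-Z0-9*]+$", with_newline)
# \Z matches only at the very end of the string.
strict = re.match(r"\A[a-zA-Z0-9*]+\Z", with_newline)

# With re.MULTILINE, $ also matches before every newline.
dollar_multi = re.match(r"^[a-zA-Z0-9*]+$", two_lines, re.MULTILINE)
strict_multi = re.match(r"\A[a-zA-Z0-9*]+\Z", two_lines, re.MULTILINE)

print(dollar is not None)        # True
print(strict is not None)        # False
print(dollar_multi is not None)  # True
print(strict_multi is not None)  # False
```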


2

I think you're looking for

re.match('^[a-zA-Z0-9*]+$', data) and data

The extra and data is just to return data, but I'm not sure why you need that. Checking the re.match result against None is enough to check whether the string is valid.
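
For readers on Python 3.4 or later, `re.fullmatch` requires the pattern to match the entire string, so no anchors are needed. A minimal sketch of the `validate_func` from the question, assuming that Python version (the constant name `PATTERN` is only for illustration):

```python
import re

PATTERN = r"[a-zA-Z0-9*]+"  # no anchors needed with fullmatch


def validate_func(pattern, data):
    """Return data if the whole string matches pattern, otherwise None."""
    return data if re.fullmatch(pattern, data) else None


print(validate_func(PATTERN, "asdsaq2323asds"))   # asdsaq2323asds
print(validate_func(PATTERN, "asdsaq2323-asds"))  # None
print(validate_func(PATTERN, "abc123\n"))         # None (trailing newline rejected)
```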
