37

I'm trying to check whether a string is a number, so the regex "\d+" seemed like a good fit. However, that regex also matches "78.46.92.168:8000", which I do not want. Here is a little bit of code:

import re

class Foo():
    _rex = re.compile("\d+")
    def bar(self, string):
        # _rex is a class attribute, so it must be accessed through self
        m = self._rex.match(string)
        if m is not None:
            doStuff()

And doStuff() is called when the IP address is entered. I'm kind of confused: how does "." or ":" match "\d"?

5 Answers

35

\d+ matches any positive number of digits within your string, so it matches the first 78 and succeeds.

Use ^\d+$.

Or, even better: "78.46.92.168:8000".isdigit()
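
To make the difference concrete, here is a small sketch comparing the three checks on the address from the question. The variable names are only illustrative and not part of the original code.

```python
import re

address = "78.46.92.168:8000"

# re.match only requires the pattern to match at the start of the string,
# so the leading "78" is enough for an unanchored \d+ to succeed.
prefix_match = re.match(r"\d+", address)
matched_prefix = prefix_match.group()

# Anchoring the end forces the digits to run to the end of the string.
anchored_match = re.match(r"^\d+$", address)

# str.isdigit() checks every character without using a regex at all.
all_digits = address.isdigit()

print(matched_prefix, anchored_match, all_digits)  # 78 None False
```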

4 Comments

\d+$ should be sufficient with match
$ doesn't work in the case of a trailing newline. See re.match(r'^\d+$', '4\n') for example.
what do the $ and ^ do?
@CharlieParker: ^ matches the start of a line and $ matches the end.
30

There are a couple of options in Python to match an entire input with a regex.

Python 2 and 3

In Python 2 and 3, you may use

re.match(r'\d+$', s) # re.match anchors the match at the start of the string, so $ is what remains to add

or - to avoid matching before the final \n in the string:

re.match(r'\d+\Z', s) # \Z will only match at the very end of the string

Or use re.search with the same patterns. Because re.search does not anchor the match at the start of the string, it needs an explicit start-of-string anchor, ^ or \A:

re.search(r'^\d+$', s)
re.search(r'\A\d+\Z', s)

Note that \A is an unambiguous string start anchor, its behavior cannot be redefined with any modifiers (re.M / re.MULTILINE can only redefine the ^ and $ behavior).
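
A short sketch of how these anchors differ in practice; the sample strings and variable names are made up for illustration.

```python
import re

with_newline = "1234\n"

# $ also matches just before a trailing newline, so this succeeds.
dollar = re.match(r"\d+$", with_newline)
# \Z matches only at the very end of the string, so this fails.
strict = re.match(r"\d+\Z", with_newline)

multi = "abc\n1234"
# With re.MULTILINE, ^ and $ match at the start and end of every line,
# so the second line alone satisfies the pattern.
caret_multiline = re.search(r"^\d+$", multi, re.MULTILINE)
# \A and \Z are unaffected by re.MULTILINE and still refer to the whole string.
a_multiline = re.search(r"\A\d+\Z", multi, re.MULTILINE)

print(bool(dollar), bool(strict), bool(caret_multiline), bool(a_multiline))
# True False True False
```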

Python 3

Everything described in the section above also applies in Python 3, which adds one more useful method, re.fullmatch (also present in the PyPI regex module):

If the whole string matches the regular expression pattern, return a corresponding match object. Return None if the string does not match the pattern; note that this is different from a zero-length match.

So, after you compile the regex, just use the appropriate method:

_rex = re.compile("\d+")
if _rex.fullmatch(s):
    doStuff()
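
As a rough check of how fullmatch treats the inputs discussed in this thread, the sketch below wraps it in a helper. The name is_all_digits is hypothetical and only used for this illustration.

```python
import re

_rex = re.compile(r"\d+")

def is_all_digits(s):
    # fullmatch succeeds only if the pattern consumes the entire string,
    # so no explicit anchors are needed here.
    return _rex.fullmatch(s) is not None

print(is_all_digits("8000"))               # True
print(is_all_digits("78.46.92.168:8000"))  # False
print(is_all_digits("4\n"))                # False, the trailing newline is not a digit
print(is_all_digits(""))                   # False, \d+ needs at least one digit
```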

3 Comments

do you need the r at the beginning of the regex?
@Charlie It is not required, but I'd use re.compile(r"\d+")
A note for all future visitors: it is a good idea NOT to omit the anchors even when using re.fullmatch. If your pattern must match the entire string, it is a good idea to keep the logic inside the pattern, since there may be scenarios with porting the pattern to other modules (like, e.g. Pyspark) where the regex functions are different and do not allow such flexibility.
14

re.match() always matches from the start of the string (unlike re.search()) but allows the match to end before the end of the string.

Therefore, you need an anchor: re.match(r"\d+$", string) would work.

To be more explicit, you could also use re.match(r"^\d+$", string) (where the ^ is redundant) or just drop re.match() altogether and use re.search(r"^\d+$", string).
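
The difference between the two functions can be seen with a string whose digits are not at the start. This is a minimal sketch and the sample text is invented.

```python
import re

text = "port 8000"

# re.match looks only at the start of the string, where there are no digits.
at_start = re.match(r"\d+", text)
# re.search scans the whole string and finds the digits further in.
anywhere = re.search(r"\d+", text)
# With both anchors, re.search behaves like a whole-string check.
anchored_search = re.search(r"^\d+$", text)
# For re.match, anchoring only the end is enough.
anchored_match = re.match(r"\d+$", "8000")

print(at_start, anywhere.group(), anchored_search, anchored_match.group())
# None 8000 None 8000
```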

1 Comment

do you need the r at the beginning of the regex?
8

\Z matches the end of the string while $ matches the end of the string or just before the newline at the end of the string, and exhibits different behaviour in re.MULTILINE. See the syntax documentation for detailed information.

>>> s="1234\n"
>>> re.search("^\d+\Z",s)
>>> s="1234"
>>> re.search("^\d+\Z",s)
<_sre.SRE_Match object at 0xb762ed40>

5

Change it from \d+ to ^\d+$
