
I couldn't find a thread on this one, but it seems like something that should be pretty simple. I'm trying to use a regex to search a line of output for a number 0-99 and take one action, but if the number is 100, take a different action. Here's what I have tried (simplified version):

OUTPUT = #Some command that will store the output in variable OUTPUT
OUTPUT = OUTPUT.split('\n')
for line in OUTPUT:
    if (re.search(r"Rebuild status:  percentage_complete", line)): #searches for the line, regardless of number
        if (re.search("\d[0-99]", line)): #if any number between 0 and 99 is found
            print("error")
        if (re.search("100", line)): #if number 100 is found
            print("complete")

I've tried this and it still picks up the 100 and prints error.

3 Answers


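This: \d[0-99] means a digit (\d) followed by one character from the class [0-99]. Inside square brackets, 0-99 is the range 0-9 plus a redundant 9, so the pattern matches any two consecutive digits, including the 10 in 100. If you are after the numeric range 0-99, you would need something akin to \b\d{1,2}\b, which matches any standalone run of 1 or 2 digits.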
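
To make the difference concrete, here is a minimal sketch comparing the two patterns. The sample lines are assumptions modeled on the search string in the question, not real command output.

```python
import re

# Hypothetical sample lines, modeled on the question's search string.
line_100 = "Rebuild status:  percentage_complete 100"
line_42 = "Rebuild status:  percentage_complete 42"

# [0-99] is a character class (0-9 plus a redundant 9), so \d[0-99]
# matches any two consecutive digits, including the "10" in "100".
loose = re.search(r"\d[0-99]", line_100)
print(loose.group())  # 10

# With word boundaries, a run of one or two digits must stand alone.
bounded = re.compile(r"\b\d{1,2}\b")
print(bounded.search(line_100))          # None
print(bounded.search(line_42).group())   # 42
```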


4 Comments

Yes I've tried that as well but it'll still print out the error when it finds a 100
@bladexeon: The problem was that 100 is, technically, a valid match for that regular expression (it matches the 10). I have amended the expression to include word boundaries (the \b) to cope with that.
Your approach made the most sense, but I could not make the boundaries work, so instead I did: if re.search("\d{1,2}", line) and not re.search("100", line): That seemed to work. Thanks for the help!
@bladexeon: If that is the case, then I recommend going for what PM 2Ring suggests.

You can simplify your regexes by re-ordering your number tests, and using elif instead of if on the test for 2 digit numbers.

import re

for line in output:
    if re.search(r"Rebuild status:  percentage_complete", line):
        if re.search("100", line):  # test for 100 first
            print("complete")
        elif re.search(r"\d{1,2}", line):  # reached only when "100" is absent
            print("error")

The test for a 2 digit number is performed only if the test for "100" fails.

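Using a raw string isn't strictly necessary with r"\d{1,2}", because \d is not a recognized string escape and Python passes the backslash through unchanged, but it's a good habit to use a raw string for any regex that contains a backslash. In Python 3.6 and later, the non-raw form still works but triggers a warning (a SyntaxWarning from Python 3.12 on):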

DeprecationWarning: invalid escape sequence '\d'
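
The habit matters more for escapes that Python does recognize, because those change meaning silently instead of raising a warning. As a small sketch (the sample line is an assumption), "\b" in a normal string literal is the backspace character, so a word-boundary pattern written without the r prefix never matches:

```python
import re

# In a normal string literal, "\b" is one character (backspace);
# in a raw string it is two characters: a backslash and the letter b.
print(len("\b"), len(r"\b"))  # 1 2

line = "percentage_complete 100"  # hypothetical sample line

# Non-raw: the pattern contains literal backspaces, so it cannot match.
non_raw = re.search("\b100\b", line)
# Raw: the regex engine receives \b and treats it as a word boundary.
raw = re.search(r"\b100\b", line)

print(non_raw)      # None
print(raw.group())  # 100
```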

Note that you don't need parentheses around conditions in Python, so using them just adds unnecessary clutter.


As dawg mentions in the comments, the test for "100" can be tightened to re.search(r"\b100\b", line), but that's not needed if we can guarantee that we're only testing integer percentages in the range 0 - 100.
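
A quick sketch of what the tightened test changes, using made-up sample strings. The word boundaries reject 1000, but - and . are non-word characters, so -100 and 100.5 still match; excluding those would take a lookaround such as (?<![-.]) and (?!\.).

```python
import re

# Hypothetical inputs chosen to exercise the edge cases.
samples = ["100", "1000", "-100", "100.5"]

plain = [s for s in samples if re.search("100", s)]
bounded = [s for s in samples if re.search(r"\b100\b", s)]

print(plain)    # ['100', '1000', '-100', '100.5']
print(bounded)  # ['100', '-100', '100.5']
```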

4 Comments

Your approach made me think of another: Match on \d+. Cast the matched group to int and compare numerically rather than with regex. It uses regex for what it's good at, but treats numbers as numbers rather than text.
@StevenRumbalski: I suppose that'd be more efficient, since it reduces the number of regex searches performed. On the other hand, we could replace the search for "100" with a simple str.find("100")...
Note that re.search("100", line) will match 1000, -100, 100.5 etc. Should be re.search(r"\b100\b", line)
@dawg: Good call, but I was being lazy, since the numbers are percentages, so those possibilities won't occur.
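
The numeric comparison suggested in the comments above could look roughly like this. It is a sketch, not code from the thread: rebuild_status is a hypothetical helper, and it assumes the percentage is the first run of digits on the matching line.

```python
import re

def rebuild_status(line):
    """Return 'complete', 'error', or None for one line of output.

    Hypothetical helper; the labels mirror the print calls in the question.
    """
    if "Rebuild status:  percentage_complete" not in line:
        return None
    match = re.search(r"\d+", line)  # assumes the first number is the percentage
    if match is None:
        return None
    percent = int(match.group())     # compare as a number, not as text
    if percent == 100:
        return "complete"
    if 0 <= percent <= 99:
        return "error"
    return None

print(rebuild_status("Rebuild status:  percentage_complete 100"))  # complete
print(rebuild_status("Rebuild status:  percentage_complete 7"))    # error
print(rebuild_status("Some other line 50"))                        # None
```

Because the value is compared as an integer, the order of the tests no longer matters, and a value such as 1000 is not mistaken for 100.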

0 - 99:

>>> s='\n'.join(["line {} text".format(i) for i in range(-2,101) ])
>>> import re
>>> re.findall(r'(?<!\-)\b(\d\d|\d)\b', s)
['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '40', '41', '42', '43', '44', '45', '46', '47', '48', '49', '50', '51', '52', '53', '54', '55', '56', '57', '58', '59', '60', '61', '62', '63', '64', '65', '66', '67', '68', '69', '70', '71', '72', '73', '74', '75', '76', '77', '78', '79', '80', '81', '82', '83', '84', '85', '86', '87', '88', '89', '90', '91', '92', '93', '94', '95', '96', '97', '98', '99']

The regex '(?<!\-)\b(\d\d|\d)\b' matches one- or two-digit numbers (0-99) and does not match negative numbers such as -9


100 is easy: '(?<!\-)\b100\b'

If you do not want to match floats: \b(?<![-.])(\d\d|\d)(?!\.)\b
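
As a quick check of the float-excluding pattern, here is a small sketch run against a made-up string that mixes integers, a negative number, a float, and 100:

```python
import re

no_floats = re.compile(r"\b(?<![-.])(\d\d|\d)(?!\.)\b")

# Hypothetical test string covering the cases discussed above.
text = "7 42 -9 3.5 100 99"
print(no_floats.findall(text))  # ['7', '42', '99']
```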


1 Comment

Is there anything similarly simple to match digits 0-199?
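
One possible way to extend the same idea to 0-199 is to add an alternative for three-digit numbers that start with 1. This is a sketch that was not posted in the thread, and the test string is made up:

```python
import re

# 1\d\d covers 100-199; \d{1,2} covers 0-99. The lookbehind rejects negatives.
up_to_199 = re.compile(r"(?<!-)\b(1\d\d|\d{1,2})\b")

text = "0 7 42 99 100 150 199 200 1000 -5"
print(up_to_199.findall(text))
# ['0', '7', '42', '99', '100', '150', '199']
```

For wider or irregular ranges, capturing \d+ and comparing the value as an integer is usually easier to read than growing the pattern.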
