2

I'm trying to use Python to parse a log file and match four pieces of information with one regex: the epoch time, SERVICE NOTIFICATION, the hostname, and CRITICAL. I can't seem to get this to work; so far I've only been able to match two of the four. Is it possible to do this? Below is an example line from the log file and the code I've gotten to work so far. Any help would make me a happy noob.

[1242248375] SERVICE ALERT: myhostname.com;DNS: Recursive;CRITICAL;SOFT;1;CRITICAL - Plugin timed out while executing system call

hostname = options.hostname

n = open('/var/tmp/nagios.log', 'r')
n.readline()
l = [str(x) for x in n]
for line in l:
    match = re.match (r'^\[(\d+)\] SERVICE NOTIFICATION: ', line)
    if match:
       timestamp = int(match.groups()[0])
       print timestamp

5 Answers

6

You can use | to match any one of various possible things, and re.findall to get all non-overlapping matches to some RE.
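As a rough sketch of that idea, run against the sample line from the question: the alternation lists the four things of interest, and re.findall returns whatever it finds, in order. The ten-digit pattern for the epoch time and the literal hostname are assumptions made for illustration (the question takes the hostname from options.hostname).

```python
import re

line = ('[1242248375] SERVICE ALERT: myhostname.com;DNS: Recursive;'
        'CRITICAL;SOFT;1;CRITICAL - Plugin timed out while executing system call')

hostname = 'myhostname.com'  # assumption: stands in for options.hostname

# One alternative per thing we are looking for. The groups are non-capturing,
# so findall returns the full text of each match rather than group contents.
pattern = re.compile(
    r'\d{10}'                              # assumption: epoch time is ten digits
    r'|SERVICE (?:ALERT|NOTIFICATION)'
    r'|' + re.escape(hostname) +
    r'|CRITICAL'
)

matches = pattern.findall(line)
print(matches)
# ['1242248375', 'SERVICE ALERT', 'myhostname.com', 'CRITICAL', 'CRITICAL']
```

Note that CRITICAL shows up twice because it occurs twice in the line, and that this approach does not tell you which alternative produced which match.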


2

The question is a bit confusing, but you don't need to do everything with regular expressions. There are some good plain old string functions you might want to try, like 'split'.

This version also avoids loading the entire file into memory at once, and it closes the file even if an exception is thrown.

import re

# Match only the part before the first ';' (timestamp, notification type, hostname).
regexp = re.compile(r'\[(\d+)\] SERVICE NOTIFICATION: (.+)')
with open('/var/tmp/nagios.log', 'r') as log_file:
    for line in log_file:
        fields = line.split(';')
        match = regexp.match(fields[0])
        if match:
            timestamp = int(match.group(1))
            hostname = match.group(2)
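For comparison, here is a sketch that uses only string functions and no regex at all. The helper name parse_line is made up for this example, and it assumes every line has the same layout as the sample in the question: "[epoch] TYPE: host;service;state;...".

```python
def parse_line(line):
    """Return (timestamp, kind, hostname, state), or None if the line does not fit.

    Assumes the layout of the sample line: "[epoch] TYPE: host;service;state;...".
    """
    # Split at the first ': ', which separates "[epoch] TYPE" from the fields.
    head, sep, rest = line.partition(': ')
    if not sep or not head.startswith('['):
        return None
    # "[1242248375] SERVICE ALERT" -> "1242248375" and "SERVICE ALERT"
    stamp, _, kind = head[1:].partition('] ')
    fields = rest.split(';')
    if len(fields) < 3 or not stamp.isdigit():
        return None
    return int(stamp), kind, fields[0], fields[2]


sample = ('[1242248375] SERVICE ALERT: myhostname.com;DNS: Recursive;'
          'CRITICAL;SOFT;1;CRITICAL - Plugin timed out while executing system call')
print(parse_line(sample))
# (1242248375, 'SERVICE ALERT', 'myhostname.com', 'CRITICAL')
```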


2

You can use more than one group at a time, e.g.:

import re

logstring = '[1242248375] SERVICE ALERT: myhostname.com;DNS: Recursive;CRITICAL;SOFT;1;CRITICAL - Plugin timed out while executing system call'
# Four groups: epoch time, notification type, hostname, state.
exp = re.compile(r'^\[(\d+)\] ([A-Z ]+): ([A-Za-z0-9.\-]+);[^;]+;([A-Z]+);')
m = exp.search(logstring)

for s in m.groups():
    print(s)

2 Comments

Just FYI, exp.match(logstring) works just as well in this example. I.e., the solution ISN'T to use search() instead of match().
Sure, good point. I'm in the habit of using search instead of match, but since we're starting at the beginning of the string it's the same thing. The key is adding four different grouping parens to grab the four things the OP wants.
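To illustrate the point about match() and search(), a small sketch using the sample line from the question (the variable names are made up for this example): with a pattern anchored by ^ the two behave the same, and they only differ when the pattern can start somewhere other than the beginning of the string.

```python
import re

line = ('[1242248375] SERVICE ALERT: myhostname.com;DNS: Recursive;'
        'CRITICAL;SOFT;1;CRITICAL - Plugin timed out while executing system call')

anchored = re.compile(r'^\[(\d+)\]')
unanchored = re.compile(r'SERVICE ALERT')

# Anchored at the start: match() and search() find the same thing.
same = anchored.match(line).group(1) == anchored.search(line).group(1)
print(same)                              # True

# Not at the start: match() fails, search() scans forward and succeeds.
print(unanchored.match(line))            # None
print(unanchored.search(line).group(0))  # SERVICE ALERT
```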
1

If you are looking to split out those particular parts of the line, then something along the lines of:

match = re.match(r'^\[(\d+)\] (.*?): (.*?);.*?;(.*?);', line)

should give you each of those parts at its respective index in match.groups().
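As a quick check, here is that pattern applied to the sample line from the question. The variable names on the left of the unpacking are only for illustration.

```python
import re

line = ('[1242248375] SERVICE ALERT: myhostname.com;DNS: Recursive;'
        'CRITICAL;SOFT;1;CRITICAL - Plugin timed out while executing system call')

match = re.match(r'^\[(\d+)\] (.*?): (.*?);.*?;(.*?);', line)
if match:
    # groups() returns the four captured strings in pattern order.
    timestamp, kind, hostname, state = match.groups()
    timestamp = int(timestamp)
    print(timestamp, kind, hostname, state)
```

The third, unparenthesized `.*?;` skips over the service description ("DNS: Recursive"), which is why only four values come back.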


0

Could it be as simple as "SERVICE NOTIFICATION" in your pattern doesn't match "SERVICE ALERT" in your example?
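This is easy to check against the sample line. The sketch below tries the pattern from the question, then a variant that accepts either wording; the variant is only a suggestion, on the assumption that the log contains both kinds of line.

```python
import re

line = ('[1242248375] SERVICE ALERT: myhostname.com;DNS: Recursive;'
        'CRITICAL;SOFT;1;CRITICAL - Plugin timed out while executing system call')

# Pattern from the question: requires the literal text SERVICE NOTIFICATION.
notification_only = re.compile(r'^\[(\d+)\] SERVICE NOTIFICATION: ')
# Variant that accepts either NOTIFICATION or ALERT.
either = re.compile(r'^\[(\d+)\] SERVICE (?:NOTIFICATION|ALERT): ')

print(notification_only.match(line))   # None
print(either.match(line).group(1))     # 1242248375
```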
