I just started learning Python and regular expressions.

How can I output the lines of a log file in which multiple keywords match on a single line?

For example, how do I get the lines that start with "5" and have the timestamp "2014/05/12 02:30:00"? I need to combine these two conditions and, reading the whole log file, count the lines that satisfy both.

The keywords are separated by commas in the log file. This is one line from the file:

5,14/05/0202:30:00,1,1,94776082043,94776082043,0,1,77100,0,1,77100,,14/05/02 02:30:00,9477000003,,,,,19,14/05/05 02:30:00,0,0,9477000007,9477000003,false,,,,,,,,true,,,0,,5011405020230005752,

Here is the code I have already, which I want to improve:

#!/usr/bin/env python

import re

path = raw_input("Enter path to log file \n")
# e.g. /home/harzyne/pythonscripts/smsc_log.old

start = raw_input("Enter Start_Time in format(hh:mm:ss) ")
print(start)
end = raw_input("Enter End_Time in format(hh:mm:ss) ")
print(end)

count = 0

# Count the lines that start with "5"; start and end are not used yet.
with open(path) as log:
    for line in log:
        if re.search('^5', line):
            count += 1
print(count)
  • Could you give an example of a log line? Commented May 29, 2014 at 15:35
  • This is one line from the log_file : 5,14/05/0202:30:00,1,1,94776082043,94776082043,0,1,77100,0,1,77100,,14/05/02 02:30:00,9477000003,,,,,19,14/05/05 02:30:00,0,0,9477000007,9477000003,false,,,,,,,,true,,,0,,5011405020230005752, Commented May 29, 2014 at 15:36

1 Answer

Would extending the regex to look like this work for your lines? I'm just including the timestamp.

re.search('^5.*?14/05/0202:30:00',line)

or if you want to only look at the very next field, just replace the .*? with a comma:

re.search('^5,14/05/0202:30:00',line)
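
To see how the two patterns differ, here is a minimal sketch that counts matching lines. The first line is the sample from the question; the other three are made-up lines added only for contrast, and the names loose, strict, loose_count and strict_count are mine.

```python
import re

lines = [
    # Sample line from the question.
    "5,14/05/0202:30:00,1,1,94776082043,94776082043,0,1,77100,0,1,77100,,"
    "14/05/02 02:30:00,9477000003,,,,,19,14/05/05 02:30:00,0,0,9477000007,"
    "9477000003,false,,,,,,,,true,,,0,,5011405020230005752,",
    # Hypothetical lines, invented for this example.
    "4,14/05/0202:30:00,1,1,94776082043",   # does not start with 5
    "5,14/05/0202:31:00,1,1,94776082043",   # different timestamp
    "5,14/05/0102:29:59,1,14/05/0202:30:00",  # timestamp only in a later field
]

# Timestamp may appear anywhere after the leading 5.
loose = re.compile(r'^5.*?14/05/0202:30:00')
# Timestamp must be the field right after the leading 5.
strict = re.compile(r'^5,14/05/0202:30:00')

loose_count = sum(1 for line in lines if loose.search(line))
strict_count = sum(1 for line in lines if strict.search(line))

print(loose_count)
print(strict_count)
```

With a real log file, iterating over the open file object instead of the list should work the same way.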

8 Comments

Sorry, I'm a beginner with regex. Does "\ /" represent the "OR" operator? The line should be counted only if it contains both the "5" and the timestamp.
I was escaping the : and / unnecessarily, so I removed that. If a character could be interpreted as a special regex character, you need to escape it. While developing, I tend to escape such characters more than needed and fix it later, that's all.
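
A small sketch of when escaping matters, using the standard library's re.escape. The value "1.5" and the variable names are made up for illustration; only the timestamp comes from the question.

```python
import re

# ':' and '/' have no special meaning in a regex, so the timestamp
# can be used in a pattern as it is.
timestamp = "14/05/0202:30:00"
plain_hit = re.search("^5," + timestamp, "5,14/05/0202:30:00,1,1")

# '.' is special: unescaped, it matches any character.
value = "1.5"  # hypothetical value, not from the log
unescaped_hit = re.search(value, "125")           # matches, '.' stands for '2'
escaped_hit = re.search(re.escape(value), "125")  # no match, '.' is literal now
literal_hit = re.search(re.escape(value), "1.5")  # matches the literal text

print(plain_hit is not None, unescaped_hit is not None,
      escaped_hit is not None, literal_hit is not None)
```

Passing text through re.escape before putting it in a pattern is a safe default when the text comes from somewhere you do not control.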
Thank you! By the way, I actually have to get the timestamp as user input, so how could I include that in the regex? (Please check my code above.)
If you are looking for a specific timestamp, you could just format the string: '^5,{timestamp}'.format(timestamp='14/05/0202:30:00'). You may have to make sure the value is formatted exactly as it appears in the log.
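
As a rough sketch of the str.format approach: count_matches is a hypothetical helper name, the re.escape call and the trailing comma (so the field has to end right after the timestamp) are my additions, and the sample lines are invented.

```python
import re

def count_matches(lines, timestamp):
    """Count lines that start with '5,' followed by exactly this timestamp field."""
    # re.escape keeps any special characters in the input literal.
    pattern = re.compile('^5,{timestamp},'.format(timestamp=re.escape(timestamp)))
    return sum(1 for line in lines if pattern.search(line))

# Made-up lines for illustration.
sample = [
    "5,14/05/0202:30:00,1,1",
    "5,14/05/0202:30:001,1",   # field continues past the timestamp
    "3,14/05/0202:30:00,1",    # does not start with 5
]

result = count_matches(sample, "14/05/0202:30:00")
print(result)
```

The same function should accept an open file object in place of the list.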
OK, thank you :) If I want to concatenate a user input (e.g. the "start" variable in my code above), how do I do it? Is it like ('^5'.+start)? That plus sign didn't work for me, even though that is what googling turned up :(
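
In Python, strings are joined with a bare + between them; '.+' is not an operator, which is probably why ('^5'.+start) failed. Below is a sketch under one assumption of mine: start holds only hh:mm:ss, while the second field in the sample line looks like 14/05/0202:30:00 (date and time run together), so the pattern lets [^,]* absorb the date part. The sample lines other than the first are invented.

```python
import re

start = "02:30:00"  # stands in for the value read from the user

# '^5,' then the rest of the second field, which must end with the entered time.
pattern = re.compile('^5,[^,]*' + re.escape(start) + ',')

# Made-up lines for illustration.
sample = [
    "5,14/05/0202:30:00,1,1",
    "5,14/05/0202:31:00,1,02:30:00,",  # time appears only in a later field
    "4,14/05/0202:30:00,1,1",          # does not start with 5
]

count = sum(1 for line in sample if pattern.search(line))
print(count)
```

Counting lines between start and end would need the time parsed and compared rather than matched as text, which this sketch does not attempt.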