I'm writing a script to search a logfile for a given Python regex pattern. Setting aside the fact that this would be much easier to do using a simple Bash script, can it be done in Python? Here's what I've run into:
Assumptions:
- I'm trying to analyze the file /var/log/auth.log (for the sake of simplicity, I'm omitting the ability to choose a file).
- The name of my CLI module is logscour.
- For the sake of argument, logscour takes only one arg called regex_in.
Intended usage:
[root@localhost]: # logscour '\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}'
Should return the lines inside of /var/log/auth.log that contain an IPv4 address.
I want to find a sort of anti-re.escape(), as I am in backslash-hell. Here's a snippet:
import re
import argparse

def main(regex_in, logfile='/var/log/auth.log'):
    ## Herein lies the problem!
    # user_regex_string = re.escape(regex_in) #<---DOESN'T WORK, EVEN MORE ESCAPE-SLASHES
    # user_regex_string = r'{}'.format(regex_in) #<---DOESN'T WORK
    user_regex_string = regex_in #<---DOESN'T WORK EITHER GAHHH
    with open(logfile, 'rb+') as authlog:
        for aline in authlog:
            if re.match(user_regex_string, aline):
                print aline

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument("regex_in", nargs="?", help="enter a python-compliant regex string. Parentheses & matching groups not supported.", default=None)
    args = parser.parse_args()
    if not args.regex_in:
        raise argparse.ArgumentError('regex_in', message="you must supply a regex string")
    main(args.regex_in)
This is giving me back nothing, as one would expect, since I'm using Python 2.7 and these are bytestrings I'm dealing with.
Does anyone know a way to convert 'foo' to r'foo', or an "opposite" for re.escape()?
re.match implicitly anchors the regex to the start of the line. You might want re.search instead.
'foo' and r'foo' are the same thing. The purpose of the r'' prefix isn't to turn a string into a regex; it's to keep the Python interpreter from treating escape sequences like \n inside the string specially and instead pass them through raw.
sh or subprocess allowed. : [
''s from my input arg. I removed those & @Eric Dunhill's advice worked.
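To illustrate the two points made in the comments, here is a minimal sketch. It assumes Python 3 (the question uses 2.7), and the helper name matching_lines and the sample log lines are made up for the example; they are not from the original script. It shows that re.search finds the IPv4 pattern anywhere in a line while re.match only tries the start of the line, and that the r prefix only changes how a literal in source code is parsed.

```python
import re

def matching_lines(pattern, lines):
    """Return the lines in which pattern occurs anywhere (re.search, not re.match)."""
    compiled = re.compile(pattern)
    return [line for line in lines if compiled.search(line)]

# Hypothetical sample lines standing in for /var/log/auth.log
sample = [
    "Jan  1 00:00:01 host sshd[123]: Failed password for root from 192.168.0.7 port 22",
    "Jan  1 00:00:02 host CRON[456]: session opened for user root",
]

# The same pattern the question passes on the command line
ipv4 = r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}'

# re.search scans the whole line, so the sshd line is found
hits = matching_lines(ipv4, sample)

# re.match only succeeds at position 0, and neither line starts with digits
no_hits = [line for line in sample if re.match(ipv4, line)]

# The r prefix affects only how the source literal is read
same = ('foo' == r'foo')   # identical strings, nothing to escape
raw_len = len(r'\n')       # two characters: backslash and n
cooked_len = len('\n')     # one character: a newline
```

A pattern that arrives through sys.argv or argparse is already a plain string containing literal backslashes, so under these assumptions it needs no "un-escaping" before being handed to re.search, provided the shell quoting is correct (single quotes, as in the intended usage).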