In many of my python projects, I find myself having to go through a file, match lines against regexes, and then perform some computation on the basis of elements from the line extracted by regex.
In pseudo-C code, this is pretty-easy:
while (read(line))
{
if (m=matchregex(regex1,line))
{
/* munch on the components extracted in regex1 by accessing m */
}
else if (m=matchregex(regex2,line))
{
/* munch on the components extracted in regex2 by accessing m */
}
else if ...
...
else
{
error("Unrecognized line format");
}
}
However, because python does not allow an assignment in the conditional of an if, this can't be done elegantly. One could first parse against all the regexes and then do the if on the various match objects, but that is neither elegant nor efficient.
What I find myself doing instead is including code like this at the base level of every project:
im=None
img=None
def imps(p,s):
global im
global img
im=re.search(p,s)
if im:
img=im.groups()
return True
else:
img=None
return False
Then I can work like this:
for line in open(file,'r').read().splitlines():
if imps(regex1,line):
# munch on contents of img
elsif imps(regex2,line):
# munch on contents of img
else:
error('Unrecognised line: {}'.format(line))
That works, is reasonably compact, and easy to type. But it is hardly beautiful; it uses global variables and is not thread safe (which has not been an issue for me so far).
But I'm sure others have run across this problem before and come up with an equally compact, but more python-y and generally superior solution. What is it?