Is there a way to check whether a line contains text that matches any of a set of regex patterns?
If I have [regex1, regex2, regex3] and I want to see whether a line matches any of them, how would I do this?
Right now I am using re.findall(regex1, line), but that only tests one regex at a time.
See also: stackoverflow.com/a/7726095/1599699 – Andrew, Jun 22, 2023 at 17:25
5 Answers
You can use the built-in functions any (or all, if every regex has to match) with a generator expression to loop over all the regex objects.
any(regex.match(line) for regex in [regex1, regex2, regex3])
(or any(re.match(regex_str, line) for regex_str in [regex_str1, regex_str2, regex_str3]) if the regexes are plain strings rather than pre-compiled regex objects, of course)
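As a minimal sketch of the any/all approach above: the patterns and helper names below are hypothetical, chosen only for illustration. Note that it uses search() rather than match(), because match() only tests the start of the line, while the question asks whether the line contains a match anywhere.

```python
import re

# Hypothetical patterns, chosen only for illustration.
patterns = [
    re.compile(r"\berror\b"),
    re.compile(r"\d{3}-\d{4}"),
    re.compile(r"^WARN"),
]

def matches_any(line, compiled_patterns):
    """Return True if at least one pattern matches somewhere in the line."""
    # search() scans the whole line; match() would only test the start.
    return any(p.search(line) for p in compiled_patterns)

def matches_all(line, compiled_patterns):
    """Return True only if every pattern matches somewhere in the line."""
    return all(p.search(line) for p in compiled_patterns)

print(matches_any("call 555-1234 now", patterns))
print(matches_all("WARN: error at 555-1234", patterns))
```

Because any() consumes a generator, it stops at the first pattern that matches instead of testing the rest.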
However, that is likely to be slower than combining your regexes into a single expression. If this code is time- or CPU-critical, try instead to compose a single regular expression that covers all your needs, using the | alternation operator to separate the original expressions.
A simple way to combine all the regexes is to use the string join method:
re.match("|".join([regex_str1, regex_str2, regex_str3]), line)
A warning about combining the regexes in this way: it can produce wrong expressions if the original ones already use the | operator.
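One way to sidestep that concern is to wrap each original expression in a non-capturing group before joining, so each one stays a self-contained alternative. This is only a sketch: combine_patterns and the sample patterns are hypothetical names for illustration. Patterns that rely on numbered backreferences or inline flags would still need extra care, since group numbers are counted across the whole combined expression.

```python
import re

def combine_patterns(pattern_strings):
    """Join pattern strings into one alternation, each wrapped in (?:...)."""
    return re.compile("|".join(f"(?:{p})" for p in pattern_strings))

# Hypothetical patterns; the first one already uses | internally.
pattern_strings = [r"cat|dog", r"\d+ birds?", r"fish$"]
combined = combine_patterns(pattern_strings)

print(combined.pattern)
print(bool(combined.search("I saw 3 birds")))
print(bool(combined.search("nothing here")))
```

Compiling once and reusing the compiled object avoids rebuilding the joined string for every line.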
2 Comments
'(' + ')|('.join(['foo', 'bar', 'baz']) + ')' gives '(foo)|(bar)|(baz)'.
(?:...), and put the string together in a way that highlights its logical structure. '|'.join('(?:{0})'.format(x) for x in ('foo', 'bar', 'baz')) for example.
Try this new regex: (regex1)|(regex2)|(regex3). This will match a line with any of the 3 regexes in it.
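If each alternative is wrapped in a capturing group like this, the match object can also tell you which alternative matched. The sketch below assumes the sub-patterns contain no capture groups of their own, so that group n corresponds to the n-th sub-pattern; the patterns and the which_pattern helper are hypothetical.

```python
import re

# Hypothetical sub-patterns; none contains capture groups of its own,
# so group n corresponds to the n-th sub-pattern.
sub_patterns = [r"foo\d", r"bar", r"baz+"]
combined = re.compile("|".join(f"({p})" for p in sub_patterns))

def which_pattern(line):
    """Return the 0-based index of the sub-pattern that matched, or None."""
    m = combined.search(line)
    if m is None:
        return None
    # lastindex is the number of the last capture group that took part
    # in the match; groups are numbered from 1.
    return m.lastindex - 1

print(which_pattern("xx bar yy"))
print(which_pattern("no match here"))
```

If you do not need to know which alternative matched, non-capturing groups keep the match object simpler.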
4 Comments
(?:...) is probably a better idea than (...) here, to avoid creating spurious capture groups.
.group(n) to determine which group you captured.
# Quite new to Python, but had the same problem. Made this to find all matches with multiple
# regular expressions.
import re

regex1 = r"your regex here"
regex2 = r"your regex here"
regex3 = r"your regex here"
regexList = [regex1, regex2, regex3]
your_string = "your string here"
found_regex_list = []  # make a list to add the matches to
for x in regexList:
    # findall returns an empty list when nothing matches,
    # so no separate check is needed before adding the results
    found_regex_list.extend(re.findall(x, your_string))
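One caveat with collecting results through re.findall is that it returns captured groups (or tuples of them) instead of the whole match when a pattern contains capture groups. A possible variant, sketched here with a hypothetical find_all_matches helper and made-up patterns, uses re.finditer and group(0) so the full matched text is always collected.

```python
import re

def find_all_matches(patterns, text):
    """Collect the full text of every match of every pattern, in pattern order."""
    found = []
    for pattern in patterns:
        # group(0) is always the whole match, regardless of capture groups.
        found.extend(m.group(0) for m in re.finditer(pattern, text))
    return found

# Hypothetical patterns and input; the second pattern has capture groups.
patterns = [r"\d+", r"(a)(b)"]
print(find_all_matches(patterns, "ab 12 ab 7"))  # ['12', '7', 'ab', 'ab']
```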
1 Comment
You can do this with a list comprehension. I was trying to identify the fields in a table that matched certain patterns. The input is a list of column names, and a list of matches is returned.
import re

def find_client_fields(cols=None):
    if cols is None:
        cols = []
    # list of regexes
    regex_list = [r'.*customer.*',
                  r'.*vendor.*',
                  r'.*user.*',
                  r'.*source.*']
    # list comprehension to find matches and ignore the case
    field = [x for x in cols for regex in regex_list
             if re.match(regex, x, re.IGNORECASE)]
    return list(set(field))  # in case there are repeated names

find_client_fields(cols=temp.columns)