One way to do this is to make a pattern that matches either word (using \b so we only match complete words), use re.findall to check the string for all matches, and then use set equality to ensure that both words have been matched.
import re
stringA = "spam"
stringB = "egg"
words = {stringA, stringB}
# Make a pattern that matches either word
pat = re.compile(r"\b{}\b|\b{}\b".format(stringA, stringB))
data = [
    "this string has spam in it",
    "this string has egg in it",
    "this string has egg in it and another egg too",
    "this string has both egg and spam in it",
    "the word spams shouldn't match",
    "and eggs shouldn't match, either",
]
for s in data:
    found = pat.findall(s)
    print(repr(s), found, set(found) == words)
output
'this string has spam in it' ['spam'] False
'this string has egg in it' ['egg'] False
'this string has egg in it and another egg too' ['egg', 'egg'] False
'this string has both egg and spam in it' ['egg', 'spam'] True
"the word spams shouldn't match" [] False
"and eggs shouldn't match, either" [] False
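A slightly more efficient way to do set(found) == words is to use words.issubset(found), since it skips the explicit conversion of found. The two tests give the same result here because found can only contain words that appear in the pattern.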
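As a quick sanity check, here is a minimal sketch comparing the two tests on a string that contains a repeated word. The sample string is made up for illustration, and the pattern is written out by hand rather than built from the variables above.

```python
import re

words = {"spam", "egg"}
# Hand-written pattern for this sketch; it matches either whole word.
pat = re.compile(r"\b(?:spam|egg)\b")

found = pat.findall("this string has both egg and spam in it, and egg again")
print(found)                    # ['egg', 'spam', 'egg']

# Explicit conversion of the list to a set before comparing.
print(set(found) == words)      # True

# issubset accepts any iterable, so the list can be passed directly.
print(words.issubset(found))    # True
```

The duplicate 'egg' does not affect either test, because both only care about which distinct words were found.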
As Jon Clements mentions in a comment, we can simplify and generalize the pattern to handle any number of words, and we should use re.escape, just in case any of the words contain regex metacharacters.
pat = re.compile(r"\b({})\b".format("|".join(re.escape(word) for word in words)))
Thanks, Jon!
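To show the generalized pattern in use, here is a small sketch that wraps it in a function. The name contains_all_words and the sample strings are my own, chosen for illustration; the pattern construction is the one shown above.

```python
import re

def contains_all_words(text, words):
    """Return True if every word in `words` occurs in `text` as a whole word."""
    words = set(words)
    # Escape each word so regex metacharacters are matched literally.
    pat = re.compile(r"\b({})\b".format("|".join(re.escape(w) for w in words)))
    return words.issubset(pat.findall(text))

print(contains_all_words("spam, egg and ham on toast", ["spam", "egg", "ham"]))  # True
print(contains_all_words("spam and eggs", ["spam", "egg"]))                      # False

# With re.escape the dot in "a.b" is literal, so "axb" does not count as a match.
print(contains_all_words("see a.b here", ["a.b"]))  # True
print(contains_all_words("see axb here", ["a.b"]))  # False
```

One caveat: \b only marks a boundary between a word character and a non-word character, so this approach works best when each word starts and ends with a letter, digit, or underscore.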
Here's a version that matches the words in the specified order. If it finds a match it prints the matching substring, otherwise it prints None.
import re
stringA = "spam"
stringB = "egg"
words = [stringA, stringB]
# Make a pattern that matches all the words, in order
pat = r"\b.*?\b".join([re.escape(word) for word in words])
pat = re.compile(r"\b" + pat + r"\b")
data = [
    "this string has spam and also egg, in the proper order",
    "this string has spam in it",
    "this string has spamegg in it",
    "this string has egg in it",
    "this string has egg in it and another egg too",
    "this string has both egg and spam in it",
    "the word spams shouldn't match",
    "and eggs shouldn't match, either",
]
for s in data:
    found = pat.search(s)
    if found:
        found = found.group()
    print('{!r}: {!r}'.format(s, found))
output
'this string has spam and also egg, in the proper order': 'spam and also egg'
'this string has spam in it': None
'this string has spamegg in it': None
'this string has egg in it': None
'this string has egg in it and another egg too': None
'this string has both egg and spam in it': None
"the word spams shouldn't match": None
"and eggs shouldn't match, either": None
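The ordered pattern works for any number of words, not just two. Here is a minimal sketch that packages it as a function; the name find_in_order and the sample strings are hypothetical, and the pattern is built the same way as in the ordered version above.

```python
import re

def find_in_order(text, words):
    """Return the shortest leading span containing all words in order, or None."""
    # Join the escaped words with a non-greedy gap that starts and ends on a word boundary.
    body = r"\b.*?\b".join(re.escape(w) for w in words)
    match = re.search(r"\b" + body + r"\b", text)
    return match.group() if match else None

print(find_in_order("first ham, then spam, and finally egg", ["ham", "spam", "egg"]))
# 'ham, then spam, and finally egg'

print(find_in_order("egg before spam", ["spam", "egg"]))  # None
print(find_in_order("spamegg", ["spam", "egg"]))          # None
```

Note that . does not match a newline by default, so words on different lines will not match unless the pattern is compiled with re.DOTALL.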