I want to search a string and see if it contains any of the following words: AB|AG|AS|Ltd|KB|University

I have this working in JavaScript:

var str = 'Hello test AB';
var forbiddenwords= new RegExp("AB|AG|AS|Ltd|KB|University", "g");

var matchForbidden = str.match(forbiddenwords);

if (matchForbidden !== null) {
   console.log("Contains the word");
} else {
   console.log("Does not contain the word");
}

How could I make the above work in Python?

2
  • 1
    Just to stir up the pot... If you're optimization-crazy, you could use University|A[BGS]|Ltd|KB or Ltd|University|[AK]B|A[GS]... But when you need to change it, that's a hell to maintain! :) Commented Jun 12, 2014 at 7:14
  • 1
    Beware: what you are actually searching for is a string pattern, not a true word. It will match inside the string "aa aABb cc", where a real word search should not. Commented Jun 12, 2014 at 7:22
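
Both comments can be checked quickly in Python. The sketch below is only illustrative, and the variable names are made up for this example. It confirms that the two compact patterns from the first comment fully match the same six words as the original alternation, and that the original pattern also matches inside the string from the second comment.

```python
import re

words = ["AB", "AG", "AS", "Ltd", "KB", "University"]

original = re.compile(r"AB|AG|AS|Ltd|KB|University")
compact_a = re.compile(r"University|A[BGS]|Ltd|KB")
compact_b = re.compile(r"Ltd|University|[AK]B|A[GS]")

# Every listed word is fully matched by all three patterns.
all_equivalent = all(
    p.fullmatch(w) for p in (original, compact_a, compact_b) for w in words
)

# The pattern matches substrings, not whole words: "AB" is found inside "aABb".
substring_hit = original.search("aa aABb cc")

print(all_equivalent)
print(substring_hit.group())
```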

4 Answers

9
import re

# str is the name of a built-in type in Python, so it's better not to shadow it
strg = "Hello test AB"

# Equivalent of new RegExp("AB|AG|AS|Ltd|KB|University"); returns a compiled pattern object
forbiddenwords = re.compile('AB|AG|AS|Ltd|KB|University')

# search() returns a match object for the first match, or None if nothing matches.
# A match object is truthy and None is falsy, so the result can be tested directly.
if forbiddenwords.search(strg):
    print('Contains the word')
else:
    print('Does not contain the word')
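
search() returns a match object (or None when nothing matches), so it can also report which word was found. A minimal sketch using the same pattern; the helper name first_forbidden_word is made up for this example:

```python
import re

forbiddenwords = re.compile('AB|AG|AS|Ltd|KB|University')

def first_forbidden_word(text):
    """Return the first forbidden word found in text, or None."""
    match = forbiddenwords.search(text)
    return match.group() if match else None

print(first_forbidden_word("Hello test AB"))  # AB
print(first_forbidden_word("Hello test"))     # None
```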

1 Comment

It would be nice if you could add a little more explanation to your answer to help the person asking the question understand why this might help him.
6

You can use the re module. Try the code below:

import re
exp = re.compile('AB|AG|AS|Ltd|KB|University')
search_str = "Hello test AB"
# re.search accepts a compiled pattern and returns a match object or None
if re.search(exp, search_str):
    print("Contains the word")
else:
    print("Does not contain the word")

3
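text = "Hello test AB"  # avoid naming a variable str; it shadows the built-in type
to_match = ["AB", "AG", "AS", "Ltd", "KB", "University"]
for each_to_match in to_match:
    if each_to_match in text:
        print("Contains")
        break
else:
    # the for loop's else branch runs only if the loop finished without break
    print("Doesn't contain")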
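
The same substring check can be written more compactly with the built-in any(). This is just an alternative sketch; the variable names are illustrative:

```python
text = "Hello test AB"
to_match = ["AB", "AG", "AS", "Ltd", "KB", "University"]

# any() stops at the first word that occurs in text, like the break above.
contains = any(word in text for word in to_match)

print("Contains" if contains else "Doesn't contain")
```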

2

You can use findall to find all matched words:

import re

s = 'Hello Ltd test AB '

find_result = re.findall(r'AB|AG|AS|Ltd|KB|University', s)

if not find_result:
    print('No words found')    
else:
    print('Words found are:', find_result)

# The result for given example s is
# Words found are: ['Ltd', 'AB']

If no word is found, re.findall returns an empty list. Also, it's better not to use str as a variable name, since that shadows Python's built-in str type.

3 Comments

if re.findall(needles, haystack): print('Found') else: print('No luck') would be more in line with the OP's original JavaScript.
Is there a way to make this only match if the word AB matches? Currently it also matches if AB is somewhere in another word
@Alosyius The simple solution would be: find_result = re.findall(r'\s+(AB|AG|AS|Ltd|KB|University)\s+', s). This assumes that there is whitespace on both sides of a word, so a word at the very start or end of the string is missed. Obviously, for periods, commas, and all the other ways a word can appear in a sentence, something more complicated is needed.
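
For the punctuation case, one common approach is the word-boundary anchor \b, which matches between a word character and a non-word character, or at the start or end of the string. A sketch under that assumption; the function name find_whole_words is hypothetical, and re.escape is used so that the words are treated literally:

```python
import re

words = ["AB", "AG", "AS", "Ltd", "KB", "University"]

# \b matches at a word boundary, so "AB" inside "aABb" is not matched,
# while "AB." or "AB" at the end of the string still is.
pattern = re.compile(r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b")

def find_whole_words(text):
    """Return all listed words that occur as whole words in text."""
    return pattern.findall(text)

print(find_whole_words("Hello Ltd, test AB."))  # ['Ltd', 'AB']
print(find_whole_words("aa aABb cc"))           # []
```

Note that \b treats digits and underscores as word characters, so the AB in AB_1 would not count as a whole word.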
