I want to create a regex in Python that includes string variables, so I can check whether those strings occur in an input sentence.

For example:

Input sentences:

In 2009, I was in Kerala. I love that place.

String 1 = I, String 2 = was

This should return the sentence that contains I and was in that order, i.e., "In 2009, I was in Kerala." should be returned. String1 and String2 can be anywhere in the sentence, but String2 must come after String1.

This is what I did so far:

r'([ A-Za-z0-9]*)' + string1 + r'([^\.!?]*)' + string2 + r'([^\.!?]*[\.!?])'

The problem is that it also matches the I inside other words, such as In. I don't want that. I want String1 and String2 to match only as whole words.

Can anyone suggest how to do this?

  • Try putting \bs around the strings. E.g. r'...\b'+string1+r'\b...\b'+string2+r'\b...' Commented Jul 16, 2016 at 6:30
  • \b means "word boundary." Commented Jul 16, 2016 at 6:30

1 Answer

Based on my suggestion in the comment above, just add \b (a word boundary) on both sides of each string. I've left the rest of your regular expression as-is:

r'([ A-Za-z0-9]*)\b{string1}\b([^\.!?]*)\b{string2}\b([^\.!?]*[\.!?])'.format(
    string1=string1, string2=string2)
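
For anyone who wants to try this end to end, here is a minimal sketch of how the pattern could be wrapped in a function. The name find_sentences is hypothetical, and the re.escape calls are my addition (not part of the pattern above) in case the strings contain regex metacharacters. It assumes both strings begin and end with word characters, since \b only behaves as expected next to letters, digits, or underscores, and it keeps the match case-sensitive.

```python
import re


def find_sentences(text, string1, string2):
    """Return sentences where string1 occurs as a whole word before string2."""
    # re.escape is an added safeguard (an assumption, not in the original
    # pattern) so that characters like '.' or '+' are matched literally.
    w1, w2 = re.escape(string1), re.escape(string2)
    pattern = re.compile(
        r'[^.!?]*\b{w1}\b[^.!?]*\b{w2}\b[^.!?]*[.!?]'.format(w1=w1, w2=w2)
    )
    # Each match is one sentence ending in '.', '!' or '?'; strip the
    # leading space left over from the previous sentence.
    return [m.group(0).strip() for m in pattern.finditer(text)]


text = "In 2009, I was in Kerala. I love that place."
print(find_sentences(text, "I", "was"))
```

With the sample input from the question, only the first sentence is returned, because the I in In is no longer treated as a match and the second sentence does not contain was.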