I'm using a function called findlist to return the positions of every occurrence of a given string within a text, using a regex with word boundaries. However, I don't want the character ( to count as a boundary; only the other word boundaries should. The pattern should find split in var split but not in split(a). Is there any way to do this?

import re

def findlist(input, place):
    return [m.span() for m in re.finditer(input, place)]

str = '''
var a = 'a b c'
var split = a.split(' ')
'''
instances = findlist(r"\b%s\b" % ('split'), str)

print(instances)  # => [(21, 26), (31, 36)], i.e. both occurrences are matched

1 Answer

You may check if there is a ( after the trailing word boundary with a negative lookahead (?!\():

instances = findlist(r"\b{}\b(?!\()".format('split'), s)
                             ^^^^^^ 

The (?!\() lookahead is checked after the whole word has been matched. If there is a ( immediately to the right of the word, the match fails. Because a lookahead consumes no characters, the reported span still covers only the word itself.

See the Python demo:

import re

def findlist(input_data, place):
    return [m.span() for m in re.finditer(input_data, place)]

s = '''
var a = 'a b c'
var split = a.split(' ')
'''
instances = findlist(r"\b{}\b(?!\()".format('split'), s)

print(instances) # => [(21, 26)]
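
If the search word comes from user input rather than a literal, it may contain regex metacharacters. Below is a minimal sketch of the same negative-lookahead idea with the word passed through re.escape first. The helper name find_word_not_called is hypothetical and not part of the original code.

```python
import re

def find_word_not_called(word, text):
    """Return spans of `word` as a whole word that is not directly followed by '('.

    Hypothetical helper for illustration; re.escape keeps any regex
    metacharacters in `word` from being interpreted as pattern syntax.
    """
    pattern = r"\b{}\b(?!\()".format(re.escape(word))
    return [m.span() for m in re.finditer(pattern, text)]

s = '''
var a = 'a b c'
var split = a.split(' ')
'''

spans = find_word_not_called('split', s)
print(spans)  # => [(21, 26)]
```

Note that \b only behaves as expected when the word starts and ends with a word character (a letter, digit, or underscore), so this sketch assumes identifier-like search terms.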

2 Comments

Is there a way to only match if there is a following (?
@RobKwasowski Turn the negative lookahead into a positive one, (?=\().
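
A minimal sketch of that positive-lookahead variant, reusing the findlist helper and sample text from the answer (the variable name calls is just illustrative):

```python
import re

def findlist(pattern, text):
    # Collect the (start, end) span of every match of `pattern` in `text`.
    return [m.span() for m in re.finditer(pattern, text)]

s = '''
var a = 'a b c'
var split = a.split(' ')
'''

# (?=\() requires a '(' right after the word, but does not consume it,
# so the reported span still ends at the last letter of the word.
calls = findlist(r"\b{}\b(?=\()".format(re.escape('split')), s)
print(calls)  # => [(31, 36)]
```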
