Python Regex: Find patterns without repetitions

Question

I want to find certain patterns in a string such as the following:

a = "3. ablkdna 08. 15. adbvnksd 4."

The expected matches are:

match = "3. "
match = "4. "

I want to exclude runs of two or more consecutive patterns, i.e. anything matched by:

([0-9]+\.[\s]*){2,}

and find only patterns that stand alone (a run of length 1), so 08. and 15. should not match.

How should I implement this?

tshiono · Accepted Answer · 2020-11-05 03:58:01Z


The following regex works for the two examples shown here:

import re
p = re.compile(r'(?<!\d\.\s)(?<!\d)\d+\.(?!\s*\d+\.)')
a = "3. ablkdna 08. 15. adbvnksd 4."
m = re.findall(p, a)
print(m)
# prints  ['3.', '4.']

a = "3. (abc), adfb 8. 1. adfg 4. asdfasd"
m = re.findall(p, a)
print(m)
# prints  ['3.', '4.']
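
To make the pattern easier to follow, here is the same regex spelled out with re.VERBOSE and one comment per piece. This is only an annotated sketch: the name p_verbose is made up for illustration, and the pattern itself is unchanged.

```python
import re

# Same pattern as above, split up with re.VERBOSE (whitespace and # comments
# inside the pattern are ignored). The name p_verbose is illustrative only.
p_verbose = re.compile(r"""
    (?<!\d\.\s)    # not preceded by digit + dot + one whitespace char (a previous number)
    (?<!\d)        # not preceded by a digit, so a match cannot start mid-number
    \d+\.          # the number itself: one or more digits, then a dot
    (?!\s*\d+\.)   # not followed by optional whitespace and another number
""", re.VERBOSE)

a = "3. ablkdna 08. 15. adbvnksd 4."
print(p_verbose.findall(a))
# prints  ['3.', '4.']
```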

Admittedly, the regex above is not complete, and many inputs will still produce false positives.
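
As a concrete illustration of such false positives, the sketch below feeds the lookaround regex two variations of the question's string. The input strings are assumptions made up for this example, not taken from the question. Python's re module requires lookbehinds to be fixed-width, so (?<!\d\.\s) can only account for exactly one whitespace character between adjacent numbers.

```python
import re

p = re.compile(r'(?<!\d\.\s)(?<!\d)\d+\.(?!\s*\d+\.)')

# Two spaces between the adjacent numbers: the fixed-width lookbehind
# (?<!\d\.\s) expects exactly one whitespace character, so '15.' slips through.
two_spaces = "3. ablkdna 08.  15. adbvnksd 4."
print(p.findall(two_spaces))
# prints  ['3.', '15.', '4.']

# No space at all between the adjacent numbers: same problem.
no_space = "3. ablkdna 08.15. adbvnksd 4."
print(p.findall(no_space))
# prints  ['3.', '15.', '4.']
```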

To write a complete regex that excludes an arbitrary pattern, we would need the absent operator (?~exp), which was introduced in Ruby 2.4.1 and is not available in Python at the time of writing.

As an alternative, consider a two-step solution that first removes every run of two or more patterns and then searches what is left:

m = re.findall(r'\d+\.\s*', re.sub(r'(\d+\.\s*){2,}', '', a))

Admittedly, this is not very elegant.
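
For reference, here is the two-step idea as a runnable sketch, next to a variant that collects each maximal run and keeps it only when it holds a single item. The function names two_step and single_pass are hypothetical, and single_pass is not part of the original answer; it is just one possible way to avoid building an intermediate string.

```python
import re

def two_step(text):
    # Step 1: delete every run of two or more adjacent "digits + dot" items.
    without_runs = re.sub(r'(\d+\.\s*){2,}', '', text)
    # Step 2: whatever "digits + dot" items remain occurred on their own.
    return re.findall(r'\d+\.\s*', without_runs)

def single_pass(text):
    # Hypothetical variant: find each maximal run, keep it only if it has one item.
    kept = []
    for run in re.finditer(r'(?:\d+\.\s*)+', text):
        items = re.findall(r'\d+\.\s*', run.group())
        if len(items) == 1:
            kept.append(items[0])
    return kept

a = "3. ablkdna 08. 15. adbvnksd 4."
print(two_step(a))
# prints  ['3. ', '4.']
print(single_pass(a))
# prints  ['3. ', '4.']
```

Both functions return the matches with any trailing whitespace included, which is why the first result is '3. ' while the last one, at the end of the string, is '4.'. Call .strip() on each item if only the number and dot are wanted.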

