Python Regex: why does this not work?

Question

Why does this not work?

re.sub('\\b[a@](\\W|[a@])*[s5$](\\W|[s5$])*[s5$](\\W|[s5$])*($|\\W)', '*', '@ss')

I do not see why @ss is not replaced by *. Similarly, @55 is not replaced.

These are replaced: a55, a5s, as5, ass

Thank you!

Just clarifying that s looks like 5 and $, and a looks like @.
– Squall Leohart, Commented Aug 16, 2012 at 23:27
Wouldn't re.sub(r'[a@][s5$]{2}', '*', '@ass') be much simpler and give the same result, or am I missing something?
– BrtH, Commented Aug 16, 2012 at 23:36
Yes, that would work, but I am writing a general regex that would work for everything :)
– Squall Leohart, Commented Aug 16, 2012 at 23:40

Kendall Frey · Accepted Answer · 2012-08-16 23:29:55Z

2

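It's because @ is not a word character, so the leading \b does not match. \b only matches between a word character and a non-word character (or a string edge), and in '@ss' the @ is preceded only by the start of the string.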
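To make the word-boundary behaviour concrete, here is a small sketch using three reduced patterns. The variable names are only illustrative and are not part of the original question.

```python
import re

# \b needs a word character (\w) on exactly one side of the position.
# '@' is a non-word character, so there is no boundary between the
# start of the string and '@'.
no_boundary = re.search(r'\b@', '@ss')

# 'a' is a word character, so the start of the string is a boundary.
at_word = re.search(r'\ba', 'ass')

# Here '@' follows the word character 'x', so a boundary does exist.
after_word = re.search(r'\b@', 'x@ss')

print(no_boundary)        # None
print(at_word.span())     # (0, 1)
print(after_word.span())  # (1, 2)
```

So `\b[a@]` can only match a leading `@` when something like a letter or digit comes directly before it, which is the opposite of what the pattern intends.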

This is my suggestion:

re.sub('(\\ba|@)(\\W|[a@])*[s5$](\\W|[s5$])*[s5$](\\W|[s5$])*($|\\W)', '*', '@ss')

(Replacing \b[a@] with (\ba|@))
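As a quick check, the original and suggested patterns can be run side by side on the inputs mentioned in the question. This is only a sketch: the names ORIGINAL, SUGGESTED and samples are made up for the comparison, and the patterns are written as raw strings, which is equivalent to the doubled backslashes above.

```python
import re

# Pattern from the question and the pattern suggested in this answer.
ORIGINAL = r'\b[a@](\W|[a@])*[s5$](\W|[s5$])*[s5$](\W|[s5$])*($|\W)'
SUGGESTED = r'(\ba|@)(\W|[a@])*[s5$](\W|[s5$])*[s5$](\W|[s5$])*($|\W)'

samples = ['@ss', '@55', 'a55', 'a5s', 'as5', 'ass']

original_results = {s: re.sub(ORIGINAL, '*', s) for s in samples}
suggested_results = {s: re.sub(SUGGESTED, '*', s) for s in samples}

for s in samples:
    print(s, '->', original_results[s], '|', suggested_results[s])
```

With the original pattern, '@ss' and '@55' come back unchanged while the other four become '*'. With the suggested pattern, all six become '*'.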

RocketDonkey · Accepted Answer · 2012-08-16 23:29:20Z

0

You don't have a pair of parentheses around the first section. Try this:

re.sub('(\\b[a@])*(\\W|[a@])*[s5$](\\W|[s5$])*[s5$](\\W|[s5$])*($|\\W)', '*', '@ss')
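A short sketch of how this pattern behaves may be useful before adopting it. The test strings 'ss' and 'pass' are additions for illustration and do not come from the question.

```python
import re

# Pattern from this answer, written as a raw string.
PATTERN = r'(\b[a@])*(\W|[a@])*[s5$](\W|[s5$])*[s5$](\W|[s5$])*($|\W)'

# '@ss' is the case from the question; 'ss' and 'pass' probe what happens
# now that the leading group may match zero times.
results = {s: re.sub(PATTERN, '*', s) for s in ['@ss', 'ss', 'pass']}

for text, replaced in results.items():
    print(text, '->', replaced)
```

This does replace '@ss', but because `(\b[a@])*` is optional the leading a/@ and its word boundary are no longer required: 'ss' becomes '*' and 'pass' becomes 'p*'. Whether that is acceptable depends on how strict the filter needs to be.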

Jon Clements · Accepted Answer · 2012-08-16 23:43:40Z

0

If you're trying to do a sort of "profanity" check, I would take the logic out of the regex.

your_string = '@ss'  # example input
look_alike = {'@': 'A', '$': 'S'}
test_string = ''.join(look_alike.get(c, c) for c in your_string.upper())  # also look at `str.translate`

Then check if 'ASS' in test_string, or do a similar check with word boundaries using a regex.
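One way this idea might look with `str.translate` and a word-boundary check is sketched below. The helper names normalise and contains_word are hypothetical, and the '5' entry is an assumption based on the question's comment that 5 looks like s.

```python
import re

# Assumed look-alike table. '5' -> 'S' is an addition taken from the
# question's comment; extend or trim the mapping as needed.
LOOK_ALIKE = str.maketrans({'@': 'A', '$': 'S', '5': 'S'})


def normalise(text):
    """Upper-case the text and map look-alike characters to letters."""
    return text.upper().translate(LOOK_ALIKE)


def contains_word(text, word='ASS'):
    """Return True if `word` occurs as a whole word after normalisation."""
    pattern = r'\b' + re.escape(word) + r'\b'
    return re.search(pattern, normalise(text)) is not None


print(normalise('@55'))         # A S S without the spaces: 'ASS'
print(contains_word('@ss'))     # True
print(contains_word('pass'))    # False
```

Because '@' has already been turned into 'A' by the time the regex runs, `\b` behaves as expected and the original problem with a leading '@' does not arise.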
