Adding word boundary to python regex capture group

Question

I am trying to preprocess text before passing it to a StanfordCoreNLP server. Some of my text looks like this:

" Conversion of code written in C# to Visual Basic .NET (VB.NET)."

The ".NET" confuses the server because its leading dot is treated as a period, which splits the single sentence into two. I want to replace a '.' that appears at the start of a word with 'DOT' so that the sentence stays intact. Note that I don't want to change anything in 'VB.NET', because StanfordCoreNLP recognizes it as one word (a proper noun).

This is what I tried so far.

print(re.sub(r"\.(\S+)", r"DOT\g<0>", text))

The result looks like this.

Conversion of code written in C# to Visual Basic DOT.NET (VBDOT.NET).

I tried adding word boundaries to the pattern r"\b\.(\S+)\b". It didn't work.

Any help would be appreciated.

Does this answer your question? Reference - What does this regex mean?
– wp78de, commented Dec 10, 2020 at 22:44

Wiktor Stribiżew · Accepted Answer · 2020-12-09 22:40:36Z

1

You can use

re.sub(r"\B\.\b", "DOT", text)

See the regex demo.

The \B\.\b regex matches a dot that is either at the start of the string or immediately preceded by a non-word char, and that is immediately followed by a word char.

See the Python demo:

import re
text = "Conversion of code written in C# to Visual Basic .NET (VB.NET)."
print( re.sub(r"\B\.\b", "DOT", text) )
# => Conversion of code written in C# to Visual Basic DOTNET (VB.NET).
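
To make the cases in the explanation above concrete, here is a minimal sketch that runs the same pattern over a few extra inputs. The helper name `protect_leading_dots`, the `token` parameter, and the sample strings are made up for illustration; only the pattern comes from the answer.

```python
import re

# Pattern from the answer: a dot with no word char before it
# and a word char right after it.
LEADING_DOT = re.compile(r"\B\.\b")

def protect_leading_dots(text, token="DOT"):
    """Hypothetical helper: replace a dot that starts a word (e.g. '.NET')."""
    return LEADING_DOT.sub(token, text)

samples = [
    ".NET is a framework.",         # dot at the very start of the string
    "Visual Basic .NET (VB.NET).",  # dot after a space vs. dot inside a word
    "See section 1.2 of a.b.",      # dots between word chars, plus a final period
]
results = [protect_leading_dots(s) for s in samples]
for line in results:
    print(line)
```

In each sample the sentence-final period is left alone, because it is not followed by a word char.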


2 Comments

akalanka Over a year ago

Could you please explain why \b\.\b doesn't work?

Wiktor Stribiżew Over a year ago

@akalanka \b\.\b matches a . that is located in between two word chars, e.g. b.c, 1.a, _._.
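
The difference between the two patterns can be checked side by side. This is a small sketch using a shortened version of the question's sentence; the variable names `between` and `leading` are illustrative only.

```python
import re

text = "Visual Basic .NET (VB.NET)."

# \b\.\b needs a word char on BOTH sides of the dot,
# so it only hits the dot inside "VB.NET".
between = re.sub(r"\b\.\b", "DOT", text)

# \B\.\b needs NO word char before the dot and a word char after it,
# so it only hits the dot at the start of ".NET".
leading = re.sub(r"\B\.\b", "DOT", text)

print(between)  # Visual Basic .NET (VBDOTNET).
print(leading)  # Visual Basic DOTNET (VB.NET).
```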
