What is word boundary while using regex in python [duplicate]

Question

What is a word boundary in a Python regex? Can someone please explain it using these examples:

Example 1

>>> import re
>>> x = '456one two three123'
>>> y=re.search(r"\btwo\b",x)
>>> y
<_sre.SRE_Match object at 0x2aaaaab47d30>

Example 2

>>> y=re.search(r"two",x)
>>> y
<_sre.SRE_Match object at 0x2aaaaab47d30>

Example 3

>>> ip="192.168.254.1234"
>>> if re.search(r"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b",ip):
...    print ip
...

Example 4

>>> ip="192.168.254.1234"
>>> if re.search(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}",ip):
...    print ip
192.168.254.1234

The documentation has the answer: docs.python.org/library/re.html#regular-expression-syntax
– David Heffernan, Commented Apr 13, 2012 at 8:58
Instead of us explaining how four examples work, why don't you ask about what you don't understand? For example, what output were you expecting, and what came out instead?
– Rik Poggi, Commented Apr 13, 2012 at 8:58
I want to know why \b is required. If I do not give examples, everyone comments that I have not tried; if I give examples, someone asks "why don't you ask about what you don't understand?" :) A distributed set of people looking at the posts :)
– Rajeev, Commented Apr 13, 2012 at 9:08
If I put regex \b into Google, I get regular-expressions.info/wordboundaries.html as the first result.
– Karl Knechtel, Commented Apr 13, 2012 at 9:16

Karl Knechtel · Accepted Answer · 2012-04-13 09:13:04Z

15

"word boundary" means exactly what it says: the boundary of a word, i.e. either the beginning or the end.

It does not consume any actual character in the input; it succeeds only if the current match position is at the beginning or end of a word.

This is important because, unlike matching whitespace, it also succeeds at the beginning or end of the entire input.

So r'\bfoo' will match 'foobar' and 'foo bar' and 'bar foo', but not 'barfoo'.

r'foo\b' will match 'foo bar' and 'bar foo' and 'barfoo', but not 'foobar'.
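
The claims above can be checked directly. This is a minimal sketch, assuming Python 3 and re.search; the helper name found is hypothetical and only there to keep the loop short. The patterns are raw strings so that \b reaches the regex engine as a word boundary instead of being read as a backspace character.

```python
import re

def found(pattern, text):
    """Return True if pattern matches anywhere in text (hypothetical helper)."""
    return re.search(pattern, text) is not None

# r'\bfoo' needs a word boundary immediately before 'foo'.
for text in ['foobar', 'foo bar', 'bar foo', 'barfoo']:
    print(repr(text), found(r'\bfoo', text))

# r'foo\b' needs a word boundary immediately after 'foo'.
for text in ['foo bar', 'bar foo', 'barfoo', 'foobar']:
    print(repr(text), found(r'foo\b', text))
```

Only the last string in each list prints False.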


4 Comments

HWende Over a year ago

Please note that in these examples the result of the match will always only contain 'foo' from e.g. 'foo bar' and so on. Just to make this clear.

Karl Knechtel Over a year ago

Yes. Also, "match" is actually imprecise, as you'd have to use re.search to get a positive result for the strings not starting with foo.

Stevoisiak Over a year ago

What characters are considered for word boundaries? Would foo\b match foo-bar, foo_bar, foo=bar, or foo.bar?

Karl Knechtel Over a year ago

@Stevoisiak I'm not sure that I knew that confidently in 2012, although I certainly could have researched and tested it. That said, your comment drew my attention to the fact that this question is a duplicate. The canonical, which I have now used to close this question as a duplicate, includes answers that explain the matter very well.
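
On the question of which characters count: in Python's re module, \b is the position between a word character (\w: letters, digits, and the underscore) and a non-word character or the edge of the string. The four strings from the comment above can be tested with a short sketch, assuming Python 3; the variable names are illustrative.

```python
import re

# \b sits between a \w character (letter, digit, underscore) and anything else.
candidates = ['foo-bar', 'foo_bar', 'foo=bar', 'foo.bar']
results = {s: bool(re.search(r'foo\b', s)) for s in candidates}
print(results)
```

Only 'foo_bar' gives False, because '_' is itself a word character, so there is no boundary between 'foo' and '_'.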

HWende · Answer · 2012-04-13 09:12:26Z

-1

Try this:

import re

ip = "192.168.254.1234"
res = re.findall(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}", ip)
print(res)

Notice how I correctly escaped the dots. A match is found ('192.168.254.123') because the regex doesn't care what comes after the last 1-3 digits.

Now:

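ip = "192.168.254.1234"
res = re.findall(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b", ip)
print(res)

This finds nothing: whichever 1-3 digits the last group takes, another digit follows, so the match never ends at a word boundary. Note that the pattern has to be a raw string; in an ordinary string literal, "\b" is a backspace character rather than a word boundary.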
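
To see the difference side by side, here is a small sketch assuming Python 3 and raw-string patterns. The variable names are illustrative, and the last pattern is the one from Example 3 in the question, with \b on both sides.

```python
import re

ip = "192.168.254.1234"
octets = r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}"

# No boundary: the match simply stops after three digits of the last group.
without_boundary = re.findall(octets, ip)

# Trailing boundary: the last group would have to end where the digits end.
with_boundary = re.findall(octets + r"\b", ip)

# Boundaries on both sides, as in Example 3 of the question.
both_boundaries = re.findall(r"\b" + octets + r"\b", ip)

print(without_boundary)  # ['192.168.254.123']
print(with_boundary)     # []
print(both_boundaries)   # []
```

This is why Example 4 prints the address and Example 3 prints nothing: without \b the regex is satisfied by a partial match inside "1234".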


2 Comments

Rajeev Over a year ago

Matching the dot was an editing mistake, please don't mind. I have corrected it now.

smci Over a year ago

This answer doesn't address the revised question by OP, suggest you delete it.
