The following example is taken from the Python re documentation:
re.split(r'\b', 'Words, words, words.')
['', 'Words', ', ', 'words', ', ', 'words', '.']
'\b' matches the empty string at the beginning or end of a word, so every match it produces is empty. As a result, running this code produces an error.
(Jupyter notebook, Python 3.6)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-128-f4d2d57a2022> in <module>
1 reg = re.compile(r"\b")
----> 2 re.split(reg, "Words, word, word.")
/usr/lib/python3.6/re.py in split(pattern, string, maxsplit, flags)
210 and the remainder of the string is returned as the final element
211 of the list."""
--> 212 return _compile(pattern, flags).split(string, maxsplit)
213
214 def findall(pattern, string, flags=0):
ValueError: split() requires a non-empty pattern match.
Since \b only matches empty strings, split() never gets the non-empty pattern match it requires. I have seen various questions about split() and empty strings. For some of them I can see why you might want this in practice, for example the question here. Answers range from "you just can't do it" to (in older ones) "it's a bug".
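To make the "only matches empty strings" point concrete, here is a minimal sketch that lists where \b matches in the example string. It uses re.finditer rather than re.split, so it should also run on Python 3.6; the variable names are just for illustration.

```python
import re

text = "Words, words, words."

# Collect the (start, end) span of every \b match in the string.
spans = [m.span() for m in re.finditer(r"\b", text)]
print(spans)

# Every match is zero-width: start and end are the same position.
all_empty = all(start == end for start, end in spans)
print(all_empty)
```

Each span marks a boundary between a word and the punctuation or whitespace next to it, and none of them consumes any characters, which is what split() complains about in 3.6.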
My question is this:
Since this is still an example on the Python web page, should this be possible? Is it possible in the bleeding-edge release?
The question in the link above involved
re.split(r'(?<!foo)(?=bar)', 'foobarbarbazbar'). It was asked in 2015, and at the time there was no way to accomplish the requirements with just re.split(). Is this still the case?
\b does not make much sense. Note that with Python 3.7, you may split with zero-length matches. You get ['foobar', 'barbaz', 'bar'] with re.split(r'(?<!foo)(?=bar)', 'foobarbarbazbar') in Python 3.7.
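For anyone stuck on an older interpreter, a rough sketch of one workaround: find the zero-width matches with re.finditer and slice the string at those positions. The helper name split_at_empty_matches is made up for this example and is not part of the re module. It assumes a pattern without capture groups and is only meant for zero-width patterns like the two in this thread.

```python
import re
import sys

def split_at_empty_matches(pattern, string):
    """Hypothetical helper: split string at each match found by re.finditer."""
    pieces = []
    last = 0
    for m in re.finditer(pattern, string):
        # Keep the text between the previous match and this one.
        pieces.append(string[last:m.start()])
        last = m.end()
    # Keep whatever follows the final match.
    pieces.append(string[last:])
    return pieces

lookaround = split_at_empty_matches(r"(?<!foo)(?=bar)", "foobarbarbazbar")
boundaries = split_at_empty_matches(r"\b", "Words, words, words.")
print(lookaround)
print(boundaries)

# On Python 3.7 and later, re.split accepts these patterns directly.
if sys.version_info >= (3, 7):
    print(re.split(r"(?<!foo)(?=bar)", "foobarbarbazbar"))
    print(re.split(r"\b", "Words, words, words."))
```

On 3.7+ the helper is unnecessary, since plain re.split should give the same lists for these two inputs.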