
I have a string that follows the pattern of one or more digits followed by a single letter ('a', 'b', or 'c'). I want to split the string after every letter.

some_function('12a44b65c')
>>> ['12a', '44b', '65c']


So far I've tried:

re.split('([abc]\d+)', '12a44b65c')
>>> ['12', 'a44', '', 'b65', 'c']
  • Try swapping the patterns: re.findall(r'\d+[abc]', '12a44b65c') Commented May 23, 2016 at 20:25

2 Answers


Your regex is backwards: it should be one or more digits followed by an a, b, or c. Additionally, I wouldn't use split, which returns annoying empty strings, but findall:

>>> re.findall(r'(\d+[abc])', '12a44b65c')
['12a', '44b', '65c']
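To get the some_function interface from the question, the findall call can be wrapped in a small helper. This is only a sketch: the name split_after_letters is made up for illustration, and it assumes the letters are limited to a, b, and c as described in the question.

```python
import re

def split_after_letters(text):
    """Return each run of digits together with the single letter that ends it."""
    # \d+ needs at least one digit; [abc] is the letter that closes the chunk.
    return re.findall(r'\d+[abc]', text)

print(split_after_letters('12a44b65c'))  # ['12a', '44b', '65c']
```

Note that findall silently skips anything that does not fit the pattern, so input that deviates from the digits-then-letter shape is dropped rather than reported.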

2 Comments

\d* will also match with no digits given. You have to use \d+ to meet the 1+ digits requirement.
Besides, there is no need for the capturing group.
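A quick way to see both points from these comments is to run the variants side by side. The input 'a12b' is a made-up string, chosen because it contains a letter with no digits in front of it.

```python
import re

sample = 'a12b'  # hypothetical input: the leading 'a' has no digits before it

loose = re.findall(r'\d*[abc]', sample)      # \d* accepts zero digits
strict = re.findall(r'\d+[abc]', sample)     # \d+ requires at least one digit
grouped = re.findall(r'(\d+[abc])', sample)  # group wraps the whole pattern

print(loose)    # ['a', '12b']
print(strict)   # ['12b']
print(grouped)  # ['12b']
```

The grouped and ungrouped \d+ patterns return the same list here because the single group spans the entire match.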

If you're able to use the newer regex module, you can even split on zero-width matches (with lookarounds, that is).

import regex as re

rx = r'(?V1)(?<=[a-z])(?=\d)'
string = "12a44b65c"
parts = re.split(rx, string)
print(parts)
# ['12a', '44b', '65c']

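This pattern matches the zero-width position that has one of a-z immediately behind it and a digit (\d) immediately ahead.
Before Python 3.7, the standard re.split() did not split on zero-width matches. In the regex module, for compatibility, you explicitly need to turn the new behaviour on with (?V1) in the pattern (at least in older releases of the module).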
See a demo on regex101.com.
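For readers on a current interpreter: since Python 3.7 the standard re module also splits on patterns that can match an empty string, so the same lookarounds should work without the third-party module. A minimal sketch, assuming Python 3.7 or later and dropping the (?V1) flag, which is specific to regex:

```python
import re  # standard library; assumes Python 3.7 or later

# Split at every position with a letter behind it and a digit ahead of it.
parts = re.split(r'(?<=[a-z])(?=\d)', '12a44b65c')
print(parts)  # ['12a', '44b', '65c']
```

Nothing is consumed by the split, because both lookarounds are zero-width, so no characters are lost and no empty strings appear.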
