I'm having an issue with a regex and hope somebody can help me out.

I have a regex pattern that I pass to re.findall() in a Python script. My goal is to search strings of varying length for references to Bible verses (e.g. John 3:16, Romans 6, etc.). The pattern mostly works, but it sometimes includes an extra leading space before the Bible book name. Here's the call:

versesToFind = re.findall(r'\d?\s?\w+\s\d+:?\d*', str)

To explain the problem better, here are my results when running it on this test string:

str = 'testing testing John 3:16 adsfbaf John 2 1 Kings 4 Romans 4'

Result (from www.pythonregex.com):

[u' John 3:16', u' John 2', u'1 Kings 4', u' Romans 4']

As you can see, John 3:16, John 2, and Romans 4 each have an extra leading space that I want to get rid of; only 1 Kings 4 comes out clean. Hopefully my explanation makes sense. Thanks in advance!

3 Answers

1

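You can make the digit and the space optional as a single unit by grouping them in parentheses. The (?: prefix makes the group non-capturing, which matters here because re.findall() returns only the group contents when the pattern contains a capturing group: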

'(?:\d\s)?\w+\s\d+:?\d*'
 ^^^    ^

This produces:

>>> s = 'testing testing John 3:16 adsfbaf John 2 1 Kings 4 Romans 4'
>>> re.findall(r'(?:\d\s)?\w+\s\d+:?\d*', s)
['John 3:16', 'John 2', '1 Kings 4', 'Romans 4']
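
To see why the grouping helps, here is a minimal side-by-side sketch, assuming Python 3 (so the results are plain strings without the u prefix). The names text, original, and grouped are illustrative and not from the question. In the original pattern, \d? and \s? are optional independently of each other, so \s? can consume the space in front of a book name even when no digit precedes it.

```python
import re

text = 'testing testing John 3:16 adsfbaf John 2 1 Kings 4 Romans 4'

# Original pattern: the digit and the space are each optional on their own,
# so the space alone can be matched in front of "John" or "Romans".
original = re.findall(r'\d?\s?\w+\s\d+:?\d*', text)

# Grouped pattern: the digit and the space are optional only as a pair,
# so a leading space is matched only when a digit comes right before it.
grouped = re.findall(r'(?:\d\s)?\w+\s\d+:?\d*', text)

print(original)  # [' John 3:16', ' John 2', '1 Kings 4', ' Romans 4']
print(grouped)   # ['John 3:16', 'John 2', '1 Kings 4', 'Romans 4']
```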

2 Comments

That works perfectly, thank you! I was trying to do exactly this but I didn't know the syntax for it.
@Matthieu Glad to help! Good to refer to docs.python.org/2/library/re.html
0

Using a list comprehension, you can strip the results in a single line:

versesToFind = [x.strip() for x in re.findall(r'\d?\s?\w+\s\d+:?\d*', str)]

0

Instead of rewriting your regular expression, you can always just strip() the whitespace:

>>> L = [u' John 3:16', u' John 2', u'1 Kings 4', u' Romans 4']
>>> print map(unicode.strip, L)
[u'John 3:16', u'John 2', u'1 Kings 4', u'Romans 4']

The map() call here is equivalent to this list comprehension:

>>> print [i.strip() for i in L]
[u'John 3:16', u'John 2', u'1 Kings 4', u'Romans 4']
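
The transcript above is Python 2 (print statement, unicode.strip). As a rough Python 3 counterpart, the sketch below assumes the matches are plain str objects; the variable names are illustrative only.

```python
# Matches as returned by the original pattern (plain str in Python 3).
matches = [' John 3:16', ' John 2', '1 Kings 4', ' Romans 4']

# In Python 3, map() returns a lazy iterator, so wrap it in list() to print it.
stripped_with_map = list(map(str.strip, matches))

# The equivalent list comprehension.
stripped_with_comprehension = [m.strip() for m in matches]

print(stripped_with_map)  # ['John 3:16', 'John 2', '1 Kings 4', 'Romans 4']
```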
