0

Could you please advise how to verify in Python whether a given string matches a given pattern, and return the result?

For example, the pattern is the following:

<[prefix]-[id] groups separated by ','>|<log>

where prefix is any number of alphabetic characters, id consists only of digits (at most 5 of them), and log is any number of any characters.

examples:

  1. proj-123|log message
  2. proj-234, proj-345|log message

I suppose the easiest way is to use a regexp, which I haven't used in Python before.

Thanks.

1
  • what's the desired output? You could just split on '|', and subsequently on ','. Commented Feb 3, 2010 at 11:49

3 Answers

2
(?:[a-z]+-\d{1,5})(?:, [a-z]+-\d{1,5})*\|.*

It's not clear what you want to capture, which is why I use non-capturing groups. If you only need a boolean:

>>> import re
>>> regex = r'[a-z]+-\d{1,5}(?:, [a-z]+-\d{1,5})*\|.*'
>>> re.match(regex, 'proj-234, proj-345|log message') is not None
True

Of course, the same result can be achieved with a sequence of simple string methods:

subj = 'proj-234, proj-345|log message'  # example input
prefs, _, log = subj.partition('|')
for group in prefs.split(', '):
    pref, _, id5 = group.partition('-')
    if id5.isdigit() and len(id5) <= 5 and pref.isalpha():
        print(pref, id5)
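
If only a boolean is needed, the string-method approach above could be wrapped roughly as follows. This is a minimal sketch, not part of the original answer: the function name `matches_by_string_methods` is made up for illustration, and it assumes Python 3.7+ (for `str.isascii`) and that whitespace around each group should be tolerated.

```python
def matches_by_string_methods(subj):
    """Return True if subj looks like 'prefix-id, prefix-id|log' (hypothetical helper)."""
    prefs, sep, _log = subj.partition('|')
    if not sep:
        # No '|' separator at all, so there is no log part.
        return False
    for group in prefs.split(','):
        # Assumption: surrounding whitespace in each group is ignored.
        pref, dash, id5 = group.strip().partition('-')
        # isalpha()/isdigit() also accept non-ASCII characters, hence the isascii() checks.
        valid = (
            bool(dash)
            and pref.isascii() and pref.isalpha()
            and id5.isascii() and id5.isdigit()
            and len(id5) <= 5
        )
        if not valid:
            return False
    return True


print(matches_by_string_methods('proj-234, proj-345|log message'))
print(matches_by_string_methods('proj-123456|log message'))
```

Unlike the loop above, this version rejects the whole string as soon as one group is malformed instead of silently skipping it.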

2 Comments

That regex will actually match all strings. You need to escape the |, use re.match only, and change the ? to * to allow more than two prefix-id groups. I'd also use \s* instead of a space.
I would just like to get a boolean indicating whether the provided string matches the pattern or not. If I understood correctly, I need to pass your pattern to re.match, right?
0

Python has a great regexp library, the re module, in its stdlib; see its documentation. Simply use re.match and that should be all you need.
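
A minimal sketch of what that could look like for the pattern in the question. The names `LINE_RE` and `matches_pattern` are made up for illustration, and the pattern is one reading of the question's description: ASCII letters for the prefix, 1 to 5 digits for the id, and optional whitespace after each comma.

```python
import re

# Hypothetical pattern, built from the question's description:
#   prefix = one or more ASCII letters, id = 1 to 5 digits,
#   groups separated by ',' (optionally followed by whitespace), then '|' and the log.
LINE_RE = re.compile(r'[A-Za-z]+-[0-9]{1,5}(?:,\s*[A-Za-z]+-[0-9]{1,5})*\|.*')


def matches_pattern(line):
    """Return True if line starts with valid prefix-id groups followed by '|'."""
    # re.match anchors at the start of the string only; the trailing .* consumes the log.
    return LINE_RE.match(line) is not None


print(matches_pattern('proj-123|log message'))
print(matches_pattern('proj-123456|log message'))
```

Compiling the pattern once is optional, but it keeps the call site short when many lines are checked.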


0

Extending SilentGhost's excellent regexp...

The following matches one or more comma-separated tags, capturing the tags in one group and the log message in another:

import re

line = 'proj-234,proj-345,proj-543|log message'
# Raw string keeps the backslashes intact; * (not +) lets a single tag match too
match = re.match(r'((?:[a-zA-Z]+-\d{1,5})(?:,[a-zA-Z]+-\d{1,5})*)\|(.*)', line)
tags = match.group(1).split(',')
log_msg = match.group(2)

I wasn't able to figure out if it was possible to capture the tags following the first tag without capturing the comma, so I decided to capture them in one group and split them after the fact.
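
In Python's re module a repeated capturing group keeps only its last match, so a two-step approach is one way around this: validate the whole line first, then pull the individual tags out of the captured prefix with `findall`. The sketch below uses made-up names (`LINE_RE`, `TAG_RE`, `parse_line`) and assumes optional whitespace after each comma.

```python
import re

# Hypothetical patterns for illustration.
# LINE_RE validates the whole line and captures the tag list and the log message.
LINE_RE = re.compile(r'((?:[a-zA-Z]+-\d{1,5})(?:,\s*[a-zA-Z]+-\d{1,5})*)\|(.*)')
# TAG_RE extracts each (prefix, id) pair from the already validated tag list.
TAG_RE = re.compile(r'([a-zA-Z]+)-(\d{1,5})')


def parse_line(line):
    """Return (list of (prefix, id) tuples, log message), or None if line does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    tags = TAG_RE.findall(match.group(1))  # commas are never part of a tag
    return tags, match.group(2)


print(parse_line('proj-234,proj-345,proj-543|log message'))
```

Because `findall` runs only on text that `LINE_RE` has already accepted, it cannot pick up stray tag-like fragments from the log message.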

