Verify whether a string matches a pattern in Python

Question

Could you please advise how to verify in Python whether a given string matches a given pattern, and return the result?

For example, the pattern is the following:

<[prefix]-[id] separated by ','>|<log>

where prefix is any number of alphabetic characters, id consists only of digits and is at most 5 digits long, and log is any number of any characters.

Examples:

proj-123|log message
proj-234, proj-345|log message

I suppose the easiest way is to use a regexp, which I haven't used in Python.

Thanks.

What's the desired output? You could just split on '|', and subsequently on ','.
– telliott99, Commented Feb 3, 2010 at 11:49

SilentGhost · Accepted Answer · 2010-02-03 12:58:28Z

2

(?:[a-z]+-\d{1,5})(?:, [a-z]+-\d{1,5})*\|.*

It's not clear what you want to capture, which is why I use non-capturing groups. If you only need a boolean:

>>> import re
>>> regex = r'[a-z]+-\d{1,5}(?:, [a-z]+-\d{1,5})*\|.*'
>>> re.match(regex, 'proj-234, proj-345|log message') is not None
True
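If the check is needed in more than one place, it could be wrapped in a small helper. The following is only a sketch: the names `PATTERN` and `matches_pattern` are made up for illustration, and it assumes that the prefix is one or more ASCII letters (upper or lower case) and that whitespace around the commas is optional, which the question does not state explicitly.

```python
import re

# Assumptions (not confirmed by the question): the prefix is one or more
# ASCII letters, the id is 1 to 5 digits, and whitespace around the commas
# separating the groups is optional.
PATTERN = re.compile(r'[A-Za-z]+-\d{1,5}(?:\s*,\s*[A-Za-z]+-\d{1,5})*\|.*')


def matches_pattern(line):
    """Return True if the whole line looks like '<prefix>-<id>[, ...]|<log>'."""
    # fullmatch (Python 3.4+) requires the entire string to fit the pattern.
    return PATTERN.fullmatch(line) is not None


print(matches_pattern('proj-123|log message'))
print(matches_pattern('proj-234, proj-345|log message'))
print(matches_pattern('proj-123456|log message'))
```

Note that `.` does not match a newline by default, so a multi-line log message would be rejected unless `re.DOTALL` is passed.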

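Of course, the same result can be achieved with a sequence of simple string methods:

subj = 'proj-234, proj-345|log message'  # example input
prefs, _, log = subj.partition('|')  # split the tags from the log message
for group in prefs.split(', '):
    pref, _, id5 = group.partition('-')
    # prefix must be alphabetic, id must be 1 to 5 digits
    if id5.isdigit() and len(id5) <= 5 and pref.isalpha():
        print(pref, id5)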
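Since the question asks for a boolean, the string-method approach could also be turned into a validator along these lines. This is a rough sketch, not part of the original answer: `is_valid` is a hypothetical name, and it assumes that a missing '|' makes the line invalid and that whitespace around each group can be ignored.

```python
def is_valid(subj):
    """Return True if subj looks like '<prefix>-<id>[, <prefix>-<id> ...]|<log>'."""
    prefs, sep, _log = subj.partition('|')
    if not sep:
        # Assumption: a line without the '|' separator is invalid.
        return False
    for group in prefs.split(','):
        pref, dash, id5 = group.strip().partition('-')
        # Every group needs a dash, an alphabetic prefix and a 1-5 digit id.
        if not (dash and pref.isalpha() and id5.isdigit() and len(id5) <= 5):
            return False
    return True


print(is_valid('proj-234, proj-345|log message'))
print(is_valid('proj-123456|log message'))
```

One caveat: `str.isalpha()` and `str.isdigit()` also accept non-ASCII letters and digits, so this is slightly more permissive than the `[a-z]` and `\d{1,5}` regex above.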

edited Feb 3, 2010 at 12:58

answered Feb 3, 2010 at 11:48

SilentGhost


2 Comments

interjay Over a year ago

That regex will actually match all strings. You need to escape the |, use re.match only, and change the ? to * to allow more than two prefix-id groups. I'd also use \s* instead of a space.

yart Over a year ago

I would like just to get boolean if provided string matches pattern or not. If I understood correctly then I need to put your pattern into re.match right?
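To illustrate interjay's point about escaping: an unescaped `|` is the alternation operator, so a pattern ending in `|.*` has an alternative that matches anything. The earlier revision of the regex is not shown in this thread, so the patterns below are simplified stand-ins rather than the exact expression the comment refers to.

```python
import re

# Simplified stand-in patterns (the original revision is not shown here).
unescaped = r'[a-z]+-\d{1,5}|.*'   # '|' splits the pattern into two alternatives
escaped = r'[a-z]+-\d{1,5}\|.*'    # '\|' matches a literal pipe character

bad_input = 'not a valid line'

# The '.*' alternative matches any string, so this is always a match.
unescaped_matches = re.match(unescaped, bad_input) is not None

# With the pipe escaped, the string must really contain '<prefix>-<id>|'.
escaped_matches = re.match(escaped, bad_input) is not None

print(unescaped_matches, escaped_matches)
```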

gruszczy · Accepted Answer · 2010-02-03 11:48:31Z

0

Python has a great regexp library in the stdlib: the re module. Here is the documentation. Simply use re.match and that should be all you need.

answered Feb 3, 2010 at 11:48

gruszczy


liwp · Accepted Answer · 2010-02-03 12:09:02Z

0

Extending SilentGhost's excellent regexp...

The following matches any number of comma-separated tags and captures the tags in one group and the log message in another group:

import re

line = 'proj-234,proj-345,proj-543|log message'
# '*' (rather than '+') also accepts a line with a single tag
match = re.match(r'((?:[a-zA-Z]+-\d{1,5})(?:,[a-zA-Z]+-\d{1,5})*)\|(.*)', line)
if match:  # re.match returns None when the line does not fit the pattern
    tags = match.group(1).split(',')
    log_msg = match.group(2)

I wasn't able to figure out whether it is possible to capture the tags following the first tag without also capturing the comma, so I decided to capture them all in one group and split them afterwards.
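On the capturing question: in Python's `re` module, a capturing group inside a repetition only keeps the text of its last iteration, so the individual tags cannot be collected from a single repeated group. The sketch below shows that behaviour and one possible alternative to `split(',')`, namely running `re.findall` over the tag portion; the variable names are illustrative only.

```python
import re

line = 'proj-234,proj-345,proj-543|log message'

# A capturing group inside '(...)*' remembers only its final iteration.
m = re.match(r'([a-zA-Z]+-\d{1,5})(?:,([a-zA-Z]+-\d{1,5}))*\|(.*)', line)
first_tag, last_repeated_tag, log_msg = m.groups()
print(first_tag, last_repeated_tag, log_msg)

# Alternative to split(','): pull every tag out of the part before the '|'.
tags_part = line.partition('|')[0]
tags = re.findall(r'[a-zA-Z]+-\d{1,5}', tags_part)
print(tags)
```

Note that `findall` on its own does not validate the line, so it is best used after a successful `re.match` as in the code above.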

answered Feb 3, 2010 at 12:09

liwp
