
I have been struggling with this for an embarrassingly long time so I have come here for help.

I want to match all strings that have a number followed by an optional dash followed by more numbers.

Example:

#Match
1
34-1
2-5-2
15-2-3-309-1

# Don't match
1--
--
#$@%^#$@#
dafadf
10-asdf-1
-12-1-

I started with this regex (one or more digits, optionally followed by any number of groups of a dash and one or more digits):

\d+(-\d+)*

That didn't work. Then I tried parenthesizing around the \d:

(\d)+(-(\d)+)*

That didn't work either. Can anybody help me out?

1
  • The problem is actually in your definition... you need to be more specific. You list 1--, 10-asdf-1, and (maybe) -12-1- as "Don't match", but at least the first two contain a substring that fits your text description, and the regex finds it. The regex does match your text description, but apparently not what you really want. Commented Feb 24, 2014 at 16:54

4 Answers

6

You can use:

^(\d+(?:$|(?:-\d+)+))


Perhaps even a better alternative since it is anchored on both ends:

^(\d+(?:-\d+)*)$
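
To see why anchoring on both ends matters, here is a small sketch comparing the two patterns above. The sample strings 12-1- and 7-8x are my own additions, not taken from the question:

```python
import re

# First pattern: anchored at the start only
# (its `$` applies just to the bare-digits branch).
start_only = re.compile(r'^(\d+(?:$|(?:-\d+)+))')
# Second pattern: anchored at both ends.
both_ends = re.compile(r'^(\d+(?:-\d+)*)$')

# Hypothetical inputs chosen to expose the difference.
samples = ['34-1', '12-1-', '7-8x']

for s in samples:
    m1 = start_only.search(s)
    m2 = both_ends.search(s)
    print(s, m1.group(1) if m1 else None, m2.group(1) if m2 else None)
```

The start-only pattern reports a match on the valid prefix of 12-1- and 7-8x, while the fully anchored pattern rejects both strings.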


Make sure that you use the right flags and re method:

import re

tgt='''
#Match
1
34-1
2-5-2
15-2-3-309-1

# Don't match
1--
--
#$@%^#$@#
dafadf
10-asdf-1
-12-1-
'''

# re.M makes ^ and $ match at the start and end of every line
print(re.findall(r'^(\d+(?:-\d+)*)$', tgt, re.M))
# ['1', '34-1', '2-5-2', '15-2-3-309-1']

5 Comments

Thank you for your answer! Wow, I did not realize it would be this complicated...
Actually, it is not that complicated once you get in the habit of speaking in terms of the character groups you want to match: I want to match a line (OK, ^.*$) that starts with digits (OK, ^\d+$) and optionally has a hyphen followed by more digits (OK, trickiest part, but ^(\d+(?:-\d+)*)$)
That's fair. I actually hadn't even heard of character groups and lookaheads and stuff like that until these answers here. I'll study them a bit more. Thanks for the help.
Wait! One last thing: in your regex, the ?: bit doesn't seem to be necessary, right? It could've just been: ^(\d+(-\d+)*)$
Good catch. That is true, but it makes it a little clearer (IMHO) that you do not intend to capture the -\d+ groups; you only want to test for the presence of 0 or more of them with the *. It is a matter of style really, and in Perl at least a matter of slight performance improvement.
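
One practical note on the ?: question, as a sketch rather than a rule: with re.findall specifically, the number of capturing groups changes the shape of the result, so the two spellings are not interchangeable there. The short input string below is made up for illustration:

```python
import re

# A few lines from the question, joined into one multi-line string.
lines = '1\n34-1\n2-5-2\n1--\n'

# One capturing group: findall returns a list of strings.
non_capturing = re.findall(r'^(\d+(?:-\d+)*)$', lines, re.M)

# Two capturing groups: findall returns a list of tuples,
# where the second item is the last repetition of the inner group.
capturing = re.findall(r'^(\d+(-\d+)*)$', lines, re.M)

print(non_capturing)  # ['1', '34-1', '2-5-2']
print(capturing)      # [('1', ''), ('34-1', '-1'), ('2-5-2', '-2')]
```

If you only test for a match (for example with re.search), the extra group makes no visible difference.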
1

Here's a regex I constructed that covers all your positive test cases; the ruleset is Python:

^(?=\d)([-\d]+)*(?<=\d)$


Basically, there's a lookahead to make sure the string starts with a digit and a lookbehind to make sure it ends with one, and everything in between consists strictly of digits and hyphens.
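
A quick check of this pattern against a few inputs. Note the last sample, 1--1, which is my own test string and not from the question: because the middle part allows any mix of digits and hyphens, consecutive hyphens are accepted as long as the string starts and ends with a digit.

```python
import re

lookaround = re.compile(r'^(?=\d)([-\d]+)*(?<=\d)$')

# The first three come from the question; '1--1' is a hypothetical extra case.
samples = ['15-2-3-309-1', '1--', '-12-1-', '1--1']

results = {s: bool(lookaround.search(s)) for s in samples}
for s, ok in results.items():
    print(s, ok)
```

If doubled hyphens should be rejected, a pattern such as ^\d+(?:-\d+)*$ is the stricter choice.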

1 Comment

Thank you for your answer! Wow, I did not realize it would be this complicated...
1

This should do it:

^((?:\d+(?:-|$))+)$

Working regex example:

http://regex101.com/r/sD0oL7
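
A small sketch of how this pattern behaves, in case it helps. It handles every example in the question, but since each repetition may end in a hyphen, a single trailing hyphen is accepted too. The inputs 1- and 12-1- are my own, not from the question:

```python
import re

pattern = re.compile(r'^((?:\d+(?:-|$))+)$')

# Question examples plus two hypothetical trailing-hyphen cases.
samples = ['34-1', '1--', '-12-1-', '1-', '12-1-']

results = {s: bool(pattern.search(s)) for s in samples}
for s, ok in results.items():
    print(s, ok)
```

Whether that matters depends on whether inputs like 1- can occur in your data.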


1

Your original regex seems to work fine for the example inputs you've given, with one caveat: you need to either use line-begin (^) and line-end ($) anchors or ask for a full match instead of a string search, which implicitly encloses your regex in ^ and $ (e.g. re.fullmatch() rather than re.search() in Python; note that re.match() anchors only at the start, not at the end).

The other examples all work fine, but the ^$ is what's really doing it.
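
As a rough illustration with the original regex from the question, here is how the three entry points in Python's re module differ. re.search() looks anywhere in the string, re.match() anchors only at the start, and re.fullmatch() (available since Python 3.4) requires the whole string to match:

```python
import re

# The asker's original pattern, with no explicit anchors.
pattern = re.compile(r'\d+(-\d+)*')

samples = ['34-1', '1--', '10-asdf-1', '-12-1-', 'dafadf']

results = {
    s: (bool(pattern.search(s)), bool(pattern.match(s)), bool(pattern.fullmatch(s)))
    for s in samples
}
for s, (searched, matched, full) in results.items():
    print(f'{s!r}: search={searched} match={matched} fullmatch={full}')
```

Only fullmatch() rejects 1-- and 10-asdf-1 here, which is the behaviour the question asks for.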

Cheers.
