Regex - find specific number in string

Question

I've been struggling with a regular expression all day and couldn't find a solution. I'm trying to find a specific number in strings that contain numbers, semicolons, colons and whitespace.

For our purpose let's say I'm looking for number 1234

Here are a few examples that should match (every line is a different string):

1234
;1234;
1234 : 5678
;1234,3321

And examples that shouldn't match (because they contain a different number):

;12345;
0123456

My current attempt:

[^(0-9*)]1234[^(0-9*)]

Here is a permalink to Regex Tester with my problem: Regex Tester fiddle

In which language or tool are you going to use the pattern eventually? Also, this is not how character classes work; you are looking for negative lookarounds.
– Martin Ender, Commented Aug 27, 2013 at 14:18
[^(0-9*)] means any single character that is not a digit (0-9), a parenthesis (( or )) or a star (*). You may simply want [^0-9] (not a digit).
– Bernhard Barker, Commented Aug 27, 2013 at 14:26
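
To make the two comments above concrete, here is a minimal sketch that runs the original attempt against the sample strings. Python's `re` module is an assumption made for illustration only; the question does not name a language. The list names `should_match` and `should_not_match` are also illustrative.

```python
import re

# The pattern from the question. Inside [...] the characters ( ) * are
# literals, so each class means "one character that is not a digit,
# '(', ')' or '*'". A class always consumes exactly one character.
attempt = re.compile(r"[^(0-9*)]1234[^(0-9*)]")

should_match = ["1234", ";1234;", "1234 : 5678", ";1234,3321"]
should_not_match = [";12345;", "0123456"]

for s in should_match + should_not_match:
    print(repr(s), bool(attempt.search(s)))
```

The attempt fails on "1234" and "1234 : 5678" because there is no character in front of the number for the first class to consume, which is the gap that anchors or lookarounds close.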

Simon Sagi · Accepted Answer · 2013-08-27 14:23:26Z

4

Maybe try this:

([^0-9]|^)1234([^0-9]|$)

In this case you don't need the lookaround features.

You can use Debuggex to understand regexps better. It has a nice GUI to visualize the pattern.
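
As a quick check of the pattern in this answer, the sketch below tries it on the strings from the question. Python is assumed here purely for illustration, and the list names are made up for the example.

```python
import re

# Either a non-digit or the start of the string, then 1234,
# then either a non-digit or the end of the string.
pattern = re.compile(r"([^0-9]|^)1234([^0-9]|$)")

should_match = ["1234", ";1234;", "1234 : 5678", ";1234,3321"]
should_not_match = [";12345;", "0123456"]

for s in should_match + should_not_match:
    print(repr(s), bool(pattern.search(s)))

# The neighbouring characters are consumed, so they are part of the match.
print(repr(pattern.search(";1234;").group(0)))  # ';1234;'
```

Because the surrounding non-digits are part of the overall match, the number itself would have to be taken from its own capture group if only "1234" is wanted.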

answered Aug 27, 2013 at 14:23

Simon Sagi


7 Comments

Martin Ender Over a year ago

Note that this will fail to match the number at the beginning or end of the string.

Jay Over a year ago

The '1234' will be taken as repeat count for the set

Jay Over a year ago

In some implementations it will. Not all of them use {n} for the repeat count.

cHao Over a year ago

@Jay: Do you have an example of one that doesn't? Every regex implementation I've ever seen that allows repeat counts uses either {} or \{\} to delimit them.

Martin Ender Over a year ago

@Jay the one used by the OP's regex tester (ECMAScript) doesn't. Neither do the most popular ones (Perl-based or POSIX-based). I'd be very interested in a counterexample though.


Jens Erat · Accepted Answer · 2013-08-27 14:27:59Z

3

If your flavor supports lookahead and lookbehind, go with this:

(?<!\d)1234(?!\d)

Lookarounds test for the occurrence of characters without including them in the match. A negative lookaround succeeds only when the given pattern does not occur at that position.

If it supports word boundaries:

\b1234\b

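A word boundary is the position between a word character (letter, digit or underscore) and a non-word character such as whitespace or punctuation, or the start or end of the string.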

Otherwise check for non-digit characters and add string start and end:

(^|\D)1234($|\D)

If your engine does not even support \d and \D, replace them with [0-9] and [^0-9], respectively.
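
The three variants above can be compared side by side. This is only a sketch, assuming Python's `re` module (which supports fixed-width lookbehind); the dictionary `variants` and the test string "id_1234" are made up for illustration.

```python
import re

variants = {
    "lookaround": re.compile(r"(?<!\d)1234(?!\d)"),
    "word boundary": re.compile(r"\b1234\b"),
    "consuming": re.compile(r"(^|\D)1234($|\D)"),
}

should_match = ["1234", ";1234;", "1234 : 5678", ";1234,3321"]
should_not_match = [";12345;", "0123456"]

for name, rx in variants.items():
    hits = [bool(rx.search(s)) for s in should_match + should_not_match]
    print(name, hits)

# Lookarounds and \b do not consume the neighbours; the third variant does.
print(variants["lookaround"].search(";1234;").group(0))  # 1234
print(variants["consuming"].search(";1234;").group(0))   # ;1234;

# \b treats letters and underscores as word characters, so it is stricter
# than "not next to a digit" once such characters can appear in the input.
print(bool(variants["word boundary"].search("id_1234")))  # False
print(bool(variants["lookaround"].search("id_1234")))     # True
```

For the inputs described in the question (digits, semicolons, colons and whitespace) all three give the same yes/no answer; they differ in what the match contains and in how letters or underscores next to the number are treated.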

edited Aug 27, 2013 at 14:27

answered Aug 27, 2013 at 14:22

Jens Erat


3 Comments

Martin Ender Over a year ago

^ and $ in a character class are just literal characters. You need to use the anchors outside of it with an alternation like (?:^|\D)...(?:\D|$)

Jens Erat Over a year ago

Aye, I was too fast -- I edited my post, but omitted the non-capturing groups, as they were not a requirement and make the pattern more complicated.

Martin Ender Over a year ago

Sure, that's fine. I just think, even though it clutters up the pattern, it's one of the most important regex habits to get into. Someone who knows that normal parentheses capture might otherwise be confused about what we are capturing for, so I always try to be explicit and avoid any unnecessary overhead at the same time.
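
The two points in this comment thread can be illustrated with a short sketch. Python is assumed for illustration, and the pattern `in_class` is a hypothetical example of anchors placed inside a character class, not the pattern originally posted.

```python
import re

# Hypothetical pattern with the "anchors" inside the classes: here ^ (not in
# first position) and $ are ordinary characters, not anchors.
in_class = re.compile(r"[;^]1234[;$]")
print(bool(in_class.search("1234")))    # False: nothing anchors to the ends
print(bool(in_class.search("^1234$")))  # True: literal ^ and $ are matched

# Anchors outside the class, combined by alternation.
capturing = re.compile(r"(^|\D)1234($|\D)")
non_capturing = re.compile(r"(?:^|\D)1234(?:\D|$)")

s = ";1234,3321"
print(capturing.search(s).groups())      # (';', ',')
print(non_capturing.search(s).groups())  # ()
```

Both alternation forms match the same strings; the non-capturing form just avoids creating groups that nobody reads.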

Jay · Accepted Answer · 2013-08-27 14:24:35Z

0

This might work:

.*[^0-9]*[1][2][3][4][^0-9]*.*

How it works:

.*             anything
[^0-9]*        zero or more characters that are not digits
[1][2][3][4]   "1234" done this way because it will be taken as a repeat count unless escaped
[^0-9]*        zero or more characters that are not digits
.*             anything

There might be an issue with strings that start or end with "1234" and have no other characters. The match for anything on the front and back may not be needed depending on the implementation of regex.

answered Aug 27, 2013 at 14:24

Jay


4 Comments

Martin Ender Over a year ago

Since the [^0-9] are optional, the .* can match everything except the 1234 (including adjacent numbers). Also there's no need to put the numbers in classes.

Jay Over a year ago

They will be taken as a repeat count unless escaped

Martin Ender Over a year ago

No, they won't. That would be {1234}.

cHao Over a year ago

No, they won't. Repeat counts are in {}.
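
The objections in these comments can be checked with a small sketch. Python is assumed for illustration; the variable names are made up, and other engines may differ in details, although the comments above note that repeat counts are written with braces in the common flavors.

```python
import re

# The pattern from this answer.
jay = re.compile(r".*[^0-9]*[1][2][3][4][^0-9]*.*")

# Everything around 1234 is optional and .* accepts digits, so the strings
# that should be rejected are accepted as well.
print(bool(jay.search(";12345;")))  # True
print(bool(jay.search("0123456")))  # True

# Plain digits are literals; a repeat count needs braces.
print(re.search(r"[^0-9]1234", ";1234").group(0))  # ;1234
print(bool(re.fullmatch(r"a{3}", "aaa")))          # True
```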
