I want to know what the difference is between these two regular expressions, and what the pros and cons of each are.

Example input (a date): 31-12-2012.

Method A:
/(\d{2}-\d{2}-\d{4})/

And:

Method B:
^[0-9]{2}[-/][0-9]{2}[-/][0-9]{4}\$
  • Are you only asking about the \d vs. [0-9] difference? There are a LOT of other differences. The leading ^ in B anchors the match at the beginning of the string, whereas A has no anchor, so a date anywhere in any text will match. The trailing \$ in B means a literal $, not end of line (remove the \ to get that), which is a very important distinction to me. Also, A captures with () and B does not, and A accepts only - as the delimiter while B recognizes both - and /. Commented Feb 15, 2012 at 12:25
  • B would be better written as /[0-9]{2}-[0-9]{2}-[0-9]{4}/ to match the requirements. Commented Feb 15, 2012 at 12:26

5 Answers

2
  1. The first has delimiters /, the second one doesn't. For now, I assume that to be a copy/paste issue.
  2. B forces the date to start at the beginning of the string with ^; A is satisfied with a date string such as 00-00-0000 anywhere in the string.
  3. A captures the date in group 1 because of the extra (); B does no such thing. As the entire match is always available as the 0th item of a match, you could drop the unneeded ().
  4. \d vs [0-9] -> see Avner's answer.
  5. A only matches - as the day/month/year separator. Use that if you only expect -. If you expect BOTH - AND /, use [-/] as in B.
  6. B wants the date to be followed by a literal $; A doesn't. Use the one which applies. If I assume this is a copy/paste error (the $ being escaped, for no good reason, because it sat in a double-quoted string), then B matches only a string that consists entirely of a date, because of the ^regex$ anchoring, while A matches a date string anywhere in the input. Once again, use the option that applies to your data.
  7. Neither of them validates a date. Only a format that kinda looks like one, but could as well not be one.
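A minimal Python sketch of points 2, 6 and 7, assuming the `/` delimiters are dropped and using a hypothetical helper `is_real_date` that is not part of the question. It only illustrates the behaviour described above; other regex flavours may differ in detail.

```python
import re
from datetime import datetime

# Method A as posted, without the / delimiters.
METHOD_A = re.compile(r"(\d{2}-\d{2}-\d{4})")
# Method B with the backslash removed, so $ is an end anchor
# (in Python, $ also matches just before a trailing newline).
METHOD_B = re.compile(r"^[0-9]{2}[-/][0-9]{2}[-/][0-9]{4}$")
# Method B exactly as posted: \$ demands a literal dollar sign.
METHOD_B_AS_POSTED = re.compile(r"^[0-9]{2}[-/][0-9]{2}[-/][0-9]{4}\$")


def is_real_date(text: str) -> bool:
    """Hypothetical helper: True only for a valid dd-mm-yyyy calendar date."""
    try:
        datetime.strptime(text, "%d-%m-%Y")
        return True
    except ValueError:
        return False


sentence = "due on 31-12-2012, thanks"
found_a = METHOD_A.search(sentence)   # unanchored: finds the date inside the text
found_b = METHOD_B.search(sentence)   # anchored: no match, the string is not only a date
whole_b = METHOD_B.search("31-12-2012")
literal = METHOD_B_AS_POSTED.search("31-12-2012$")
format_only = METHOD_B.search("99-99-2012")  # right shape, but not a real date
```

Here `found_a.group(1)` holds the date, `found_b` is `None`, and `format_only` is a match object even though `is_real_date("99-99-2012")` is false, which is the gap point 7 warns about.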

2

Method B will accept slashes as well as dashes for the separator character. Otherwise, they are identical.

Also, be aware that Method B will accept:

31/12-2012 or 31-12/2012
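If mixed separators are not wanted, one possible fix (not part of the original question, so treat it as an assumption about the requirements) is to capture the first separator and require the same one again with a back-reference. A small Python sketch, with the `\$` in B replaced by a plain `$` anchor:

```python
import re

# Method B with $ as an end anchor: either separator is accepted at each position.
MIXED_OK = re.compile(r"^[0-9]{2}[-/][0-9]{2}[-/][0-9]{4}$")
# ([-/]) captures the first separator; \1 requires the same character again.
SAME_SEP = re.compile(r"^[0-9]{2}([-/])[0-9]{2}\1[0-9]{4}$")

samples = ["31-12-2012", "31/12/2012", "31/12-2012", "31-12/2012"]
mixed_results = [bool(MIXED_OK.search(s)) for s in samples]
same_results = [bool(SAME_SEP.search(s)) for s in samples]
```

`mixed_results` is true for all four samples, while `same_results` is true only for the first two.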

The only con I can think of is that Method B will take up more disk space because it is a longer string.

1 Comment

Otherwise, they are identical => really? Extra super-sure really?
2

\d is pretty much identical to [0-9]. I can imagine that [0-9] involves a tiny bit more parsing, but this is negligible.

Then the only difference that's left is that Method B also matches:

31/12/2012


2

Theoretically, \d should catch more than just [0-9]. In a Unicode-aware engine it also matches [۰-۹] (Eastern Arabic digits) and every other character that the Unicode standard classifies as a decimal digit (category Nd), such as Devanagari or fullwidth digits. It does not extend to Roman numerals, counting rods or hexadecimal letters, which Unicode places in other categories.

In practice, this depends on the engine: several of the online regex tools I tested treated \d as plain [0-9].
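As one concrete data point, Python 3 documents `\d` on `str` patterns as matching any Unicode decimal digit unless the `re.ASCII` flag is set. The sketch below uses escaped code points U+06F0 to U+06F9 for the Eastern Arabic digits and U+2163 for a Roman numeral; it says nothing about how other engines behave.

```python
import re

# "2012" written with Eastern Arabic digits (U+06F2 U+06F0 U+06F1 U+06F2).
eastern_year = "\u06f2\u06f0\u06f1\u06f2"
# ROMAN NUMERAL FOUR, a "letter number" (category Nl), not a decimal digit.
roman_four = "\u2163"

unicode_digits = re.fullmatch(r"\d{4}", eastern_year)            # matches
ascii_class = re.fullmatch(r"[0-9]{4}", eastern_year)            # None
ascii_flag = re.fullmatch(r"\d{4}", eastern_year, re.ASCII)      # None
roman = re.fullmatch(r"\d", roman_four)                          # None
```

So in this engine `[0-9]` (or `\d` with `re.ASCII`) is the stricter choice when only ASCII digits should be accepted.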


1

Method A will have back-reference 1 (written $1 or \1, depending on the language), since the whole regex is wrapped in ().
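A short Python sketch of what the group does and does not buy you. The three-group pattern used for the substitution is an illustrative variation, not something from the question.

```python
import re

text = "paid 31-12-2012"
with_group = re.search(r"(\d{2}-\d{2}-\d{4})", text)
without_group = re.search(r"\d{2}-\d{2}-\d{4}", text)

whole_match = with_group.group(0)   # the entire match
group_one = with_group.group(1)     # the same text, via the capturing group
no_groups = without_group.groups()  # empty tuple: nothing was captured

# Groups become useful when the parts are needed separately,
# e.g. reordering dd-mm-yyyy into yyyy-mm-dd with \1, \2, \3.
reordered = re.sub(r"(\d{2})-(\d{2})-(\d{4})", r"\3-\2-\1", "31-12-2012")
```

Because `whole_match` and `group_one` are identical here, wrapping the entire pattern in () adds nothing that match 0 does not already provide.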
