I am trying to use a C# regular expression to match a particular string of characters, but I cannot figure out how to do it. Any help is appreciated.

The string that I am trying to match is as follows, where A is an uppercase alpha character, X is an uppercase alphanumeric character and # is 0, 1 or 2.

AA-#-XX-X-XXX-XXXXXXX-XXXXXXXX

So any of the following would match the string above.

XY-1

MM-0-AB

MM-0-AB-1-ABC-1234567

VV-2-XX-7-CCC-ABCDEFG-12345678

And any of the following would NOT match.

QQ-7-AA (Only 0, 1, 2 are allowed at the second level.)

QQ-2-XX-7-CC (Partial characters for that level.)

QQ-2-XX-7-CCC-ABCDEFG- (Cannot end in a dash.)

QQ-2-XX-7-CCC-ABCDEFG-123456 (Partial characters for that level.)

So far (not that far, really) my pattern is @"^[A-Z]{2}", but I am unsure how to match the rest of the string conditionally, that is, only if it is there (I'm not even sure "conditionally" is the proper term). Do I need to write 7 different patterns for this? That seems unreasonable, but I could be wrong.

1 Answer

Have a look at the Regular Expression Language. You need the following elements:

  • uppercase alpha character: [A-Z]
  • uppercase alphanumeric character: [A-Z0-9]
  • 0, 1 or 2: [0-2]
  • dash: -
  • match x exactly n times: x{n}
  • match x zero or one time: x?
  • define a subexpression: (...)

Examples:

  • two uppercase alpha characters: [A-Z]{2}
  • two uppercase alpha characters, followed by a dash: [A-Z]{2}-
  • two uppercase alpha characters, followed by a dash, followed by 0, 1 or 2: [A-Z]{2}-[0-2]
  • two uppercase alpha characters, followed by a dash, followed by 0, 1 or 2, but with the subexpression consisting of the dash and 0, 1 or 2 occurring zero or one time:
    [A-Z]{2}(-[0-2])?
  • and so on...
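
To see how the optional subexpression behaves in C#, here is a minimal sketch that tests the two-level pattern from the last example. The `^` and `$` anchors are added here so that the whole input has to match, and the class and method names (`OptionalGroupDemo`, `IsMatch`) are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OptionalGroupDemo
{
    // Two uppercase letters, optionally followed by a dash and 0, 1 or 2.
    // ^ and $ are an addition so that the entire input must match.
    private static readonly Regex TwoLevels = new Regex(@"^[A-Z]{2}(-[0-2])?$");

    public static bool IsMatch(string input) => TwoLevels.IsMatch(input);

    public static void Main()
    {
        // "XY" and "XY-1" are accepted; "XY-" and "XY-7" are rejected.
        foreach (var s in new[] { "XY", "XY-1", "XY-", "XY-7" })
        {
            Console.WriteLine($"{s}: {IsMatch(s)}");
        }
    }
}
```

"XY-" is rejected because the dash is part of the optional group: either the whole group (dash plus digit) is present, or none of it is.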

Resulting expression:

^[A-Z]{2}(-[0-2](-[A-Z0-9]{2}(-[A-Z0-9](-[A-Z0-9]{3}(-[A-Z0-9]{7}(-[A-Z0-9]{8})?)?)?)?)?)?$
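
A short C# sketch of how this expression could be checked against the examples from the question. The pattern is the one above, unchanged; the `CodeValidator` class and its `IsValid` method are hypothetical names used only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CodeValidator
{
    // The nested-optional pattern from the answer, copied verbatim.
    private static readonly Regex Pattern = new Regex(
        @"^[A-Z]{2}(-[0-2](-[A-Z0-9]{2}(-[A-Z0-9](-[A-Z0-9]{3}(-[A-Z0-9]{7}(-[A-Z0-9]{8})?)?)?)?)?)?$");

    public static bool IsValid(string input) => Pattern.IsMatch(input);

    public static void Main()
    {
        // Examples the question says should match.
        string[] shouldMatch =
        {
            "XY-1", "MM-0-AB", "MM-0-AB-1-ABC-1234567", "VV-2-XX-7-CCC-ABCDEFG-12345678"
        };
        // Examples the question says should not match.
        string[] shouldFail =
        {
            "QQ-7-AA", "QQ-2-XX-7-CC", "QQ-2-XX-7-CCC-ABCDEFG-", "QQ-2-XX-7-CCC-ABCDEFG-123456"
        };

        foreach (var s in shouldMatch)
            Console.WriteLine($"{s}: {IsValid(s)} (expected True)");
        foreach (var s in shouldFail)
            Console.WriteLine($"{s}: {IsValid(s)} (expected False)");
    }
}
```

Two things may be worth knowing. In .NET, `$` also matches just before a final newline, so `\z` can be used instead if a trailing newline must be rejected. The parentheses also create capturing groups; if the captures are not needed, `(?:...)` or `RegexOptions.ExplicitCapture` should avoid them.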

3 Comments

I was so close myself, but you beat me to it. Good and well explained answer! +1
I too was too slow. I thought there was a way to shorten the A-Z0-9 (like using "\d" for the 0-9), but I can't find it. Closest I got was "\w", but that includes too many characters.
Thanks for the very thorough answer! I now understand the nesting of Regular Expressions!
