I have the following complete code example:

import re

examples = [
    "D1",       # expected: ('1')
    "D1sjdgf",  # ('1')
    "D1.2",     # ('1', '2')
    "D1.2.3",   # ('1', '2', '3')
    "D3.10.3x", # ('3', '10', '3')
    "D3.10.11"  # ('3', '10', '11')
]

for s in examples:
    result = re.search(r'^D(\d+)(?:\.(\d+)(?:\.(\d+)))', s)
    print(s, result.groups())

where I want to match the one, two, or three numbers in an expression that always starts with the letter "D". I am not interested in anything after the last digit.

I would expect my regex to match e.g. D3.10.3x and return ('3','10','3'), but instead it returns only ('3',). I do not understand why.

^D(\d+)(?:\.(\d+)(?:\.(\d+)))

  • ^D matches "D" at the start
  • (\d+) matches the first number (one or more digits) inside a capturing group.
  • (?: starts a non-capturing group. I do not want to get this group back.
  • \. A literal point
  • (\d+) A group of one or more numbers I want to "catch"

I also do not understand what a "non-capturing" group means in this context, as used in this answer.

  • [\.(\d+)+] is a character class that has no group, [\.(\d+)+] = [.)(+\d]. It should be (?:\.(\d+))?, i.e. you must use an optional non-capturing group instead of a character class. Commented Aug 11, 2022 at 15:21
  • question updated Commented Aug 11, 2022 at 15:21
  • Try ^D(\d+)(?:\.(\d+)(?:\.(\d+))?)? Commented Aug 11, 2022 at 15:23
  • The other answers do not seem to answer my question. Commented Aug 11, 2022 at 15:27
  • OK, I removed the square brackets. It still does not work, even with non-capturing groups. Should I create a new question that will get closed without a valid reason? Commented Aug 11, 2022 at 15:29

1 Answer

You may use this regex with a start anchor and two capture groups inside nested optional non-capturing groups:

^D(\d+)(?:\.(\d+)(?:\.(\d+))?)?

Explanation:

  • ^: Start
  • D: Match letter D
  • (\d+): Match 1+ digits in capture group #1
  • (?:: Start outer non-capture group
    • \.: Match a dot
    • (\d+): Match 1+ digits in capture group #2
    • (?:: Start inner non-capture group
      • \.: Match a dot
      • (\d+): Match 1+ digits in capture group #3
    • )?: End inner optional non-capture group
  • )?: End outer optional non-capture group
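
To see why the two trailing ? quantifiers matter, here is a small sketch that compares the pattern from the question (no quantifiers) with the pattern above. The names strict and optional are only illustrative and do not come from the question.

```python
import re

# Pattern from the question: both nested groups are mandatory,
# so a match requires all three numbers.
strict = re.compile(r'^D(\d+)(?:\.(\d+)(?:\.(\d+)))')

# Pattern from this answer: each nested group may be absent.
optional = re.compile(r'^D(\d+)(?:\.(\d+)(?:\.(\d+))?)?')

for s in ["D1", "D1.2", "D3.10.3x"]:
    m_strict = strict.search(s)
    m_optional = optional.search(s)
    # search() returns None when the pattern does not match at all.
    print(s,
          m_strict.groups() if m_strict else None,
          m_optional.groups())

# (?:...) only groups for structure; it adds no entry to groups().
# Pattern.groups holds the number of capturing groups.
print(optional.groups)
```

With the mandatory version, search() returns None for any input with fewer than three numbers, so calling .groups() on the result raises an AttributeError. The non-capturing groups exist only so that ? can be applied to "a dot followed by a number" as one unit.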

Code Demo:

import re

examples = [
    "D1",       # ('1', None, None)
    "D1sjdgf",  # ('1', None, None)
    "D1.2",     # ('1', '2', None)
    "D1.2.3",   # ('1', '2', '3')
    "D3.10.3x", # ('3', '10', '3')
    "D3.10.11"  # ('3', '10', '11')
]

rx = re.compile(r'^D(\d+)(?:\.(\d+)(?:\.(\d+))?)?')

for s in examples:
    result = rx.search(s)
    print(s, result.groups())

Output:

D1 ('1', None, None)
D1sjdgf ('1', None, None)
D1.2 ('1', '2', None)
D1.2.3 ('1', '2', '3')
D3.10.3x ('3', '10', '3')
D3.10.11 ('3', '10', '11')
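
Groups that did not take part in the match come back as None. If you would rather get tuples of only the numbers that were present, as in the comments of the question, one possible sketch is to filter the None entries out. The helper name numbers is hypothetical, and returning an empty tuple for a non-matching string is an assumption about the desired behavior.

```python
import re

rx = re.compile(r'^D(\d+)(?:\.(\d+)(?:\.(\d+))?)?')

def numbers(s):
    """Return the matched numbers as a tuple of strings, without None placeholders."""
    m = rx.search(s)
    if m is None:
        # The string does not start with "D" followed by a digit.
        return ()
    return tuple(g for g in m.groups() if g is not None)

for s in ["D1", "D1.2", "D3.10.3x", "X1.2"]:
    print(s, numbers(s))
```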

1 Comment

Sorry, I used your suggestion in some other regex. Seems to work. And maybe I even understand it a bit how it works. Thanks!
