I need to parse a regex with another regex. I have this regex string:

[a-z]{1}[-][0-9]{4}[-][ab]

The regex I came up with for parsing the string above, which almost works, is:

/(?|\[(.*?)\]\{(.*?)\}|\[(.*?)\](.*?))/g

What it does can be seen in this regex101 example. The error is in Match 2, Group 1, which is -][0-9 but should be just -.


The goal is to match everything inside square brackets [] that is followed by a number inside curly brackets {}. If the curly brackets {} after the square brackets [] are missing, Group 2 should be filled with null; that is what the second alternative in the branch reset group is for. Likewise, if square brackets are followed directly by another pair of square brackets, it should behave like the latter case (match what is inside the square brackets [] and fill Group 2 with null).

The problem is that my regex doesn't stop at the first [-]: it matches up to -][0-9 instead of matching just - and then starting over with [0-9]{4}.
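
The overshoot can be reproduced in isolation. The sketch below is only an illustration in Python, which is an assumption (the question names no language, only regex101). Python's built-in `re` module has no branch reset group, so it runs just the first alternative of the pattern:

```python
import re

source = "[a-z]{1}[-][0-9]{4}[-][ab]"

# First alternative only: lazy ".*?" inside the brackets, then a required "{...}".
first_branch = re.compile(r"\[(.*?)\]\{(.*?)\}")

# The lazy group still grows past "]" until a "]{" pair is found,
# so the second match swallows "[-]" together with "[0-9]{4}".
matches = first_branch.findall(source)
print(matches)  # [('a-z', '1'), ('-][0-9', '4')]
```

Because this alternative is tried first at every position, it wins at the first [-] before the second alternative gets a chance.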

The expected matches are:

[a-z]{1}
a-z
1

[-]
-
null

[0-9]{4}
0-9
4

[-]
-
null

[ab]
ab
null

The current matches are incorrect:

[a-z]{1}
a-z
1

[-][0-9]{4}
-][0-9
4

[-]
-
null

[ab]
ab
null

What am I missing?

  • Is this what you expect? \[([^]]*)](?:\{(\d+)\})?. If the quantifier inside the {} is missing, there will be no group 2 for that match. Commented Dec 25, 2021 at 13:54
  • Yes, this is very close, although Group 2 must always be present; if it is missing, it should be set to null. Please make it an answer. Commented Dec 25, 2021 at 14:00
  • I see, I was missing this part [([^]]*)\]. Thanks! Commented Dec 25, 2021 at 14:04
  • With this, you will always get group 1 and 2. But, in group 2, the curly brackets will also be captured :( Commented Dec 25, 2021 at 14:13
  • Yes, this will work too. Thank you for your help! Please make it an answer. Commented Dec 25, 2021 at 14:36

1 Answer

This regex should work:

\[([^]]*)](\{\d+\}|)

Demo

Explanation:

  • \[ - matches [
  • ([^]]*) - matches 0+ occurrences of any character that is not a ] and captures this submatch in group 1
  • ] - matches ]
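  • (\{\d+\}|) - matches either a { followed by 1+ digits followed by }, or nothing. Whatever is matched is stored in Group 2, so Group 2 includes the curly brackets when they are present and is an empty string (not null) when they are missing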
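
As a rough check of the pattern above, here is a minimal Python sketch. Python is an assumption, since the question does not name a language, and `parse_tokens` is a hypothetical helper name. It strips the curly brackets from Group 2 and maps the empty alternative to `None` to get the "null" the question asks for:

```python
import re

# The answer's pattern: group 2 is "{digits}" or an empty string.
TOKEN = re.compile(r"\[([^]]*)](\{\d+\}|)")

def parse_tokens(pattern):
    """Return (class_body, quantifier_or_None) pairs (hypothetical helper)."""
    result = []
    for m in TOKEN.finditer(pattern):
        body, quant = m.group(1), m.group(2)
        # Strip the braces; the empty alternative becomes None ("null").
        result.append((body, quant[1:-1] if quant else None))
    return result

tokens = parse_tokens("[a-z]{1}[-][0-9]{4}[-][ab]")
print(tokens)
# [('a-z', '1'), ('-', None), ('0-9', '4'), ('-', None), ('ab', None)]
```

The negated class [^]]* is what stops each match at the first closing bracket, so [-] is no longer merged with [0-9]{4}.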