0

Can you explain the semantics of the following regex? In particular, what do ?: and () mean?

/(?:^|:|,)(?:\s*\[)+/g

Thanks.

1

4 Answers

1

(?:^|:|,) means: match the beginning of the string, or a colon, or a comma, but don't capture the group or store a backreference.

?: on its own means: don't capture the group or store a backreference.
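To make the difference visible, here is a minimal sketch that runs the pattern from the question next to a variant whose first group is capturing. The sample string and the variable names are made up for illustration.

```javascript
// Same pattern twice; only the first group differs (non-capturing vs capturing).
const nonCapturing = /(?:^|:|,)(?:\s*\[)+/;
const capturing = /(^|:|,)(?:\s*\[)+/;

// Made-up sample input.
const sample = '{"a": [[1, 2]]}';

const m1 = sample.match(nonCapturing);
const m2 = sample.match(capturing);

console.log(m1[0]);     // the whole match: ": [["
console.log(m1.length); // 1, only the whole match is stored
console.log(m2[0]);     // the same whole match: ": [["
console.log(m2[1]);     // ":", the text matched by the capturing group
console.log(m2.length); // 2
```

Both patterns match exactly the same text; the only difference is whether the first group's text is kept.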


1

?: marks a non-capturing group. Normally () saves a reference to the matched text so it can be used in replacements.

So "abc".replace(/.(.)./, "x$1x") would return "xbx"

Adding ?: keeps the parentheses as a grouping construct but does not save the matched text for later.
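As a rough illustration of how that affects replacements, the sketch below repeats the "abc" example and adds two variants. The variable names are made up for the example.

```javascript
// Capturing group: $1 refers to the middle character.
const withCapture = "abc".replace(/.(.)./, "x$1x");
console.log(withCapture); // "xbx"

// Non-capturing group: there is no group 1, so "$1" is inserted literally.
const withoutCapture = "abc".replace(/.(?:.)./, "x$1x");
console.log(withoutCapture); // "x$1x"

// Non-capturing groups are skipped when groups are numbered:
// here $1 is "b" and $2 is "c".
const swapped = "abc".replace(/(?:.)(.)(.)/, "$2$1");
console.log(swapped); // "cb"
```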

A typical use when matching HTML tags would be something like <.*(?:href|src)=['"]([^'"]*)['"]

This looks for an href or src attribute without capturing the attribute name, and then saves only the value.
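A sketch of that idea, assuming the value should end up in group 1. The pattern below is a hypothetical reconstruction written for this example, not necessarily the one the answer had in mind, and the HTML snippets are made up.

```javascript
// (?:href|src) groups the alternation without capturing it,
// so the attribute value is the first (and only) captured group.
const attrValue = /<[^>]*\s(?:href|src)=['"]([^'"]*)['"]/;

const link = '<a class="nav" href="/docs/index.html">Docs</a>';
const image = "<img src='logo.png'>";

const linkMatch = link.match(attrValue);
const imageMatch = image.match(attrValue);

console.log(linkMatch[1]);  // "/docs/index.html"
console.log(imageMatch[1]); // "logo.png"
```

With a capturing (href|src) instead, the attribute name would become group 1 and the value would move to group 2.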


1

With parentheses () you can group things together. For example, in (?:\s*\[)+ the quantifier + (one or more) applies to the whole group \s*\[, so it matches one or more [ characters, each optionally preceded by whitespace.

By default such groups are capturing groups: the matched part is stored and can be reused through backreferences, or read from the match result.

This default behaviour can be changed by putting ?: right after the opening parenthesis, so (?:\s*\[)+ is a non-capturing group, i.e. the matched part is not stored anywhere.
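A small sketch of both points, using a made-up input string: the quantifier applies to the whole group, and only the capturing version stores anything beyond the overall match.

```javascript
// Made-up input: three opening brackets separated by whitespace.
const text = "[ [  [x";

// The + applies to the whole group \s*\[, so all three brackets match.
const grouped = text.match(/(?:\s*\[)+/);
console.log(grouped[0]);     // "[ [  ["
console.log(grouped.length); // 1, nothing captured

// A capturing group under a quantifier keeps only its last repetition.
const captured = text.match(/(\s*\[)+/);
console.log(captured[0]); // "[ [  ["
console.log(captured[1]); // "  [" (two spaces, then the bracket)

// Without any group, + applies to \[ alone, so the match stops
// at the first space.
const ungrouped = text.match(/\s*\[+/);
console.log(ungrouped[0]); // "["
```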


1

() tells the parser to capture the match so you can reference it.

?: inside the () tells the parser not to capture the match

This is the whole explanation:

Match the regular expression below «(?:^|:|,)»
   Match either the regular expression below (attempting the next alternative only if this one fails) «^»
      Assert position at the beginning of the string «^»
   Or match regular expression number 2 below (attempting the next alternative only if this one fails) «:»
      Match the character “:” literally «:»
   Or match regular expression number 3 below (the entire group fails if this one fails to match) «,»
      Match the character “,” literally «,»
Match the regular expression below «(?:\s*\[)+»
   Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
   Match a single character that is a “whitespace character” (spaces, tabs, and line breaks) «\s*»
      Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
   Match the character “[” literally «\[»

See the Regular Expression Advanced Syntax Reference for more on this and other regex syntax.
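To see the breakdown in action, here is a minimal sketch that runs the full pattern, including the g flag, against a made-up JSON-like string. The input and variable names are assumptions for illustration.

```javascript
const pattern = /(?:^|:|,)(?:\s*\[)+/g;

// Made-up input with brackets at the start, after colons, and after a comma.
const input = '[{"a": [1, [2]], "b":[[3]]}]';

// With the g flag and no capturing groups, match() returns every match.
const matches = input.match(pattern);
console.log(matches); // [ '[', ': [', ', [', ':[[' ]

// The comma before "b" is not part of any match, because what follows
// it (after the whitespace) is a quote, not an opening bracket.
console.log(matches.length); // 4
```

Since there is no m flag, ^ only matches at the very start of the string, which accounts for the first match.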

2 Comments

You might want to explain why you would want to use a non-capturing group (to confine the scope of alternation operators, to have a unit that a quantifier can be applied to etc.)
@TimPietzcker I answered based on the OP's questions. There can be many reasons to use a non-capturing group, but the most common is simply that you don't need the match to be captured. In this case, it's pretty obvious that the parentheses are used to confine the alternation in the first group and to apply a quantifier to the second group. They could just as well have been capturing groups; there is no strict rule on that.
