6

The Problem

I could match this string

(xx)

using this regex

\([^()]*\)

But it wouldn't match

(x(xx)x)

So, this regex would

\([^()]*\([^()]*\)[^()]*\)

However, this would fail to match

(x(x(xx)x)x)

But again, this new regex would

[^()]*\([^()]*\([^()]*\)[^()]*\)[^()]*

This is where you can notice the replication, the entire regex pattern of the second regex after the first \( and before the last \) is copied and replaces the center most [^()]*. Of course, this last regex wouldn't match

(x(x(x(xx)x)x)x)

But, you could always copy replace the center most [^()]* with [^()]*\([^()]*\)[^()]* like we did for the last regex and it'll capture more (xx) groups. The more you add to the regex the more it can handle, but it will always be limited to how much you add.

So, how do you get around this limitation and capture a group of parenthesis (or any two characters for that matter) that can contain extra groups within it?

Falsely Assumed Solutions

I know you might think to just use

\(.*\)

But this will match all of

(xx)xx)

when it should only match the sub-string (xx).

Even this

\([^)]*\)

will not match pairs of parentheses that have pairs nested like

(xx(xx)xx)

From this, it'll only match up to (xx(xx).

Is it possible?

So is it possible to write a regex that can match groups of parentheses? Or is this something that must be handled by a routine?

Edit

The solution must work in the JavaScript implementation of Regular Expressions

5
  • 1
    so you want the brackets to be balanced on both the sides..right Commented Dec 15, 2012 at 5:08
  • Correct. Notice the edit also. Commented Dec 15, 2012 at 5:11
  • 1
    Some problems are not made for a regex. Use a simple state machine that walks through the string a character at a time. Commented Dec 15, 2012 at 5:18
  • @Sam: Nope, JS regular expression cannot do this task. Just parse the string normally with a stack. Commented Dec 15, 2012 at 5:21
  • Any links to examples of using a JS-based state machine to recursively parse strings? I'm struggling to find anything. Commented Jan 30, 2018 at 0:04

2 Answers 2

2

If you want to match only if the round brackets are balanced you cannot do it by regex itself..

a better way would be to

1>match the string using \(.*\)

2>count the number of (,) and check if they are equal..if they are then you have the match

3>if they are not equal use \([^()]*\) to match the required string

Sign up to request clarification or add additional context in comments.

1 Comment

Yup, I'd consider something like this. Be careful though! It's not enough to have the same number of ( and )! For instance, consider: )( or ( ) ) ( ( ).
0

Formally speaking, this isn't possible using regular expressions! Regular expressions define regular languages, and regular languages can't have balanced parenthesis.

However, it turns out that this is the sort of thing people need to do all the time, so lots of Regex engines have been extended to include more than formal regular expressions. Therefore, you can do balanced brackets with regular expressions in javascript. This article might help get you started: http://weblogs.asp.net/whaggard/archive/2005/02/20/377025.aspx . It's for .net, but the same applies for the standard javascript regex engine.

Personally though, I think it's best to solve a complex problem like this with your own function rather than leveraging the extended features of a Regex engine.

1 Comment

“It's for .net, but the same applies for the standard javascript regex engine.” No, JavaScript’s regular expressions don’t support the features used in the article.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.