My string:

AA,$,DESCRIPTION(Sink, clinical),$

Wanted matches:

AA
$
DESCRIPTION(Sink, clinical)
$

My regex so far:

\+d|[\w$:0-9`<>=&;?\|\!\#\+\%\-\s\*\(\)\.ÅÄÖåäö]+

This gives

AA
$
DESCRIPTION(Sink
clinical)

I want to keep everything between ( and ) in a single match, including the comma.

https://regex101.com/r/MqFUmk/3

  • Please verify the string in the example! The desired output makes no sense! Commented Feb 15, 2017 at 20:29
  • Oops. Sorry. Fixed! Commented Feb 15, 2017 at 20:31
  • How many levels of parentheses could there be? Could the string be like this: $, gg(ee(), yy), ee? Commented Feb 15, 2017 at 20:34
  • Maybe this answer could help if you use parentheses instead: stackoverflow.com/a/18147076/340760 Commented Feb 15, 2017 at 20:37
  • @BrunoLM That's a nice link, though I'd like to point out that the delimiters are commas, and the secondary delimiters are quotes (which OP could replace with parentheses... maybe) Commented Feb 15, 2017 at 20:53

4 Answers

1

Here's my attempt at the regex

\+d|[\w$:0-9`<>=&;?\|\!\#\+\%\-\s\*\.ÅÄÖåäö]+(\(.+\))?

I removed the parentheses from the character class [ ] and added an optional group after it, (\(.+\))?, that matches a parenthesized part. It seems to handle the example in the regex101 link you posted.

Depending on how arbitrary your input is, this regex might not be suitable for more complex strings.

Alternatively, here's an answer which could be more robust than mine, but may only work in Ruby.

((?>[^,(]+|(\((?>[^()]+|\g<-1>)*\)))+)

1 Comment

The problem is that if you add another closing parenthesis, it matches from the first parenthesis to the last one. I would suggest replacing the capture group with (\(.+?\))?, which makes the match inside the parentheses lazy. The remaining problem is nested parentheses such as (a(b)c), which would not be matched correctly.
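The difference between the two variants can be checked with a small sketch. The second and third input strings are made up for illustration, and the g flag is added so that match() returns every match:

```javascript
// Regex from the answer (greedy group) and the lazy variant suggested in the comment.
const greedy = /\+d|[\w$:0-9`<>=&;?\|\!\#\+\%\-\s\*\.ÅÄÖåäö]+(\(.+\))?/g;
const lazy = /\+d|[\w$:0-9`<>=&;?\|\!\#\+\%\-\s\*\.ÅÄÖåäö]+(\(.+?\))?/g;

// Both handle the string from the question.
console.log("AA,$,DESCRIPTION(Sink, clinical),$".match(greedy));
// → ["AA", "$", "DESCRIPTION(Sink, clinical)", "$"]

// Two parenthesized fields (hypothetical input): greedy runs to the last ")".
console.log("A(b, c),D(e, f)".match(greedy)); // → ["A(b, c),D(e, f)"]
console.log("A(b, c),D(e, f)".match(lazy));   // → ["A(b, c)", "D(e, f)"]

// Nested parentheses (hypothetical input): lazy stops at the first ")".
console.log("X(a(b)c)".match(lazy));          // → ["X(a(b)", "c"]
```

Neither variant tracks nesting, so both are only safe when each field has at most one flat pair of parentheses.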
1

This one seems to work for me:

([^,\(\)]*(?:\([^\(\)]*\))?[^,\(\)]*)(?:,|$)

https://regex101.com/r/hLyJm5/2

Hope this helps!
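A minimal sketch of how this pattern might be used from JavaScript, assuming an environment with String.prototype.matchAll. The helper name splitFields is hypothetical, and the g flag is added so the pattern is applied repeatedly:

```javascript
// Pattern from this answer; group 1 holds the field without its trailing comma.
const fieldPattern = /([^,\(\)]*(?:\([^\(\)]*\))?[^,\(\)]*)(?:,|$)/g;

// splitFields is a hypothetical helper name, not part of the original answer.
function splitFields(input) {
  const fields = [];
  for (const match of input.matchAll(fieldPattern)) {
    // The pattern also matches the empty string at the very end of the input;
    // skip that zero-length match so it does not appear as an extra field.
    if (match[0] === "") continue;
    fields.push(match[1]);
  }
  return fields;
}

console.log(splitFields("AA,$,DESCRIPTION(Sink, clinical),$"));
// → ["AA", "$", "DESCRIPTION(Sink, clinical)", "$"]
```

Skipping the zero-length match also drops a genuinely empty trailing field (as in "A,"), so that case would need separate handling.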

1

Personally, I would first replace all commas within parentheses () with a character that will never occur (in my case I used @ since I don't see it within your inclusions) and then I would split them by commas to keep it sweet and simple.

var myStr = "AA,$,DESCRIPTION(Sink, clinical),$";        //Initial string
myStr = myStr.replace(/\([^()]*\)/g, function(m) { return m.replace(/,/g, "@"); }); //Replace every , within parentheses with @
var myArr = myStr.split(',').map(function(s) { return s.replace(/@/g, ','); }); //Split string on , then restore the commas
//myArr -> ["AA","$","DESCRIPTION(Sink, clinical)","$"]

Optionally, if you're using ES6, you can change that last line to:

myArr = myStr.split(',').map(s => s.replace(/@/g, ','));  //Yay Arrow Functions!

Note: If you have nested parentheses, this answer will need a modification.
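For nested parentheses, one possible modification is to drop the regex and scan the string while counting parenthesis depth. This is a different technique from the replace-and-split approach above, sketched under the assumption that parentheses are balanced; splitTopLevel is a hypothetical helper name:

```javascript
// splitTopLevel is a hypothetical helper, not part of the original answer.
// It splits on commas that are not inside any parentheses.
function splitTopLevel(input) {
  const fields = [];
  let depth = 0;     // current parenthesis nesting level
  let current = "";  // field being accumulated
  for (const ch of input) {
    if (ch === "(") depth++;
    else if (ch === ")" && depth > 0) depth--;
    if (ch === "," && depth === 0) {
      fields.push(current); // a top-level comma ends the field
      current = "";
    } else {
      current += ch;        // commas inside parentheses are kept
    }
  }
  fields.push(current);
  return fields;
}

console.log(splitTopLevel("AA,$,DESCRIPTION(Sink, clinical),$"));
// → ["AA", "$", "DESCRIPTION(Sink, clinical)", "$"]
console.log(splitTopLevel("$, gg(ee(), yy), ee"));
// → ["$", " gg(ee(), yy)", " ee"]
```

Whitespace around fields is preserved; add a .trim() per field if that is not wanted.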

0

Finally, here is an approximation of what you need:

\w+(?:\(.*\))|\w+|\$

https://regex101.com/r/MqFUmk/4
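A quick way to try this pattern in JavaScript is sketched below. The helper name tokenize and the second input string are made up for illustration, and the g flag is added so that every match is returned:

```javascript
// Pattern from this answer, applied globally.
const pattern = /\w+(?:\(.*\))|\w+|\$/g;

// tokenize is a hypothetical helper name; it returns [] when nothing matches.
function tokenize(input) {
  return input.match(pattern) || [];
}

console.log(tokenize("AA,$,DESCRIPTION(Sink, clinical),$"));
// → ["AA", "$", "DESCRIPTION(Sink, clinical)", "$"]

// A field containing a space (hypothetical input) is split into two matches,
// because \w does not match whitespace.
console.log(tokenize("AA,Hand basin,$"));
// → ["AA", "Hand", "basin", "$"]
```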

3 Comments

Nice and short. Spaces are not included though.
@PerStröm They are not included. Do you need them?
I think [a-z\$A-Z]+(?:(.*))|[a-z\$A-Z]+ is more efficient.
