4

I'm having trouble splitting a string based on a regex.

String str = "1=(1-2,3-4),2=2,3=3,4=4";
Pattern commaPattern = Pattern.compile("\\([0-9-]+,[0-9-]+\\)|(,)") ;
String[] arr = commaPattern.split(str);
for (String s : arr)
{
    System.out.println(s);
}

Expected output:

1=(1-2,3-4)     
2=2    
3=3    
4=4

Actual output:

1=

2=2
3=3
4=4
2
  • 3
    Regex isn't going to solve it for you. You need a parser. Commented Mar 29, 2013 at 7:52
  • 1
    @Bohemian There is no need for a parser for such a simple problem; a parser would be overkill. Commented Mar 29, 2013 at 8:08

4 Answers

6

This regex would split as required

,(?![^()]*\\))
  ------------
      |->split with , only if it is not within ()
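
A minimal sketch of that split applied to the string from the question, assuming the parentheses are balanced and never nested (the class and method names are only illustrative, not part of the original answer):

```java
import java.util.regex.Pattern;

class SplitOutsideParens {
    // A comma that is NOT followed by "any non-parenthesis characters, then ')'".
    // In other words: a comma with no closing parenthesis pending ahead of it.
    private static final Pattern TOP_LEVEL_COMMA = Pattern.compile(",(?![^()]*\\))");

    // Hypothetical helper: splits only on commas outside (...) groups.
    static String[] splitTopLevel(String input) {
        return TOP_LEVEL_COMMA.split(input);
    }

    public static void main(String[] args) {
        for (String part : splitTopLevel("1=(1-2,3-4),2=2,3=3,4=4")) {
            System.out.println(part);
        }
    }
}
```

For the input above this prints the four expected lines: 1=(1-2,3-4), 2=2, 3=3 and 4=4. With nested parentheses such as (1,(2),3) the lookahead can no longer tell whether a comma is inside a group, so this approach only fits flat input.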
2 Comments

Can you explain it a little bit? For example, why \\) is considered as a part of the Quantifiers instead of a Character? Why ,?!([^()]*\\)) won't work? Thanks!
@JingHe From what I can recall, only the , is actually consumed; everything after it is a lookahead. A lookahead doesn't consume characters. It works like a boolean condition: instead of consuming characters and moving ahead, the regex engine only checks whether the condition is true. This is required because if you split without a lookahead, the split also removes the other matched content, not just the ,
3

This isn't well suited for a split(...). Consider scanning through the input and matching instead:

String str = "1=(1-2,3-4),2=2,3=3,4=4";

Matcher m = Pattern.compile("(\\d+)=(\\d+|\\([^)]*\\))").matcher(str);

while(m.find()) {
  String key = m.group(1);
  String value = m.group(2);
  System.out.printf("key=%s, value=%s\n", key, value);
}

which would print:

key=1, value=(1-2,3-4)
key=2, value=2
key=3, value=3
key=4, value=4

1

You will have to use some lookahead mechanism here. As I see it, you are trying to split on commas that are not inside parentheses. But your regular expression says:

Split on a comma OR on a parenthesized pair of number ranges separated by a comma

So your string gets split in 4 places: 1) at (1-2,3-4), and 2-4) at each of the three commas outside the parentheses.
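
To see those four places directly, here is a small sketch that lists everything the question's pattern matches, since each match is what split(...) treats as a delimiter and throws away. The class and method names are hypothetical, chosen only for this illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ShowDelimiters {
    // Hypothetical helper: returns every substring the regex matches,
    // i.e. every piece of text that split(...) would remove.
    static List<String> delimiters(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String str = "1=(1-2,3-4),2=2,3=3,4=4";
        // The pattern from the question, unchanged.
        String regex = "\\([0-9-]+,[0-9-]+\\)|(,)";
        for (String d : delimiters(regex, str)) {
            System.out.println("delimiter: " + d);
        }
    }
}
```

The first delimiter reported is the whole group (1-2,3-4), followed by the three plain commas. Removing the group is what leaves "1=" and the empty string in the actual output.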

-4
String[] arr = commaPattern.split(str);

should be

String[] arr = str.split(commaPattern);

1 Comment

No, because commaPattern is a Pattern, not a regex string, and String.split expects a String. Even if you passed the regex as a string, this wouldn't fix the problem with the regular expression itself.
