OK, so I am trying to split a String on ", " separators that are not inside '[' and ']'. I have a working regex for JavaScript but have been unable to convert it to Java syntax.

JS regex:

/,(?![^[]*])/g

Example sentence:

ex1 , [ex2 , ex3 ] , ex 4 , ex 4, [ex , ex ]

When I try to use the regex in Java (in Eclipse), I get this error:

Unclosed character class near index 10 ,(?![^[]*])

All I did was remove the '/' at the beginning and the "/g" at the end, but I have been unable to translate the rest of the syntax.

What would be the best way to achieve this?

1 Answer

Update for nested square bracket support

Since you also need to support nested square brackets, and commas inside square brackets must be ignored, you need a simple parser that tracks the bracket nesting depth and collects the chunks of text between top-level commas.

import java.util.ArrayList;
import java.util.List;

public static List<String> splitWithCommaOutsideBrackets(String input) {
    int bracketCount = 0; // current nesting depth of [ ... ]
    int start = 0;        // index where the current item begins
    List<String> result = new ArrayList<>();
    for (int i = 0; i < input.length(); i++) {
        switch (input.charAt(i)) {
        case ',':
            if (bracketCount == 0) { // only split on commas outside all brackets
                result.add(input.substring(start, i).trim()); // Trims the item!
                start = i + 1;
            }
            break;
        case '[':
            bracketCount++;
            break;
        case ']':
            bracketCount--;
            if (bracketCount < 0)
                return result; // Unbalanced: a ] without a matching [
            break;
        }
    }
    if (bracketCount > 0)
        return result; // Missing closing ]
    result.add(input.substring(start).trim()); // Trims the item!
    return result;
}

Use it like this:

String s = "ex1 , [ex2 , ex3 ] , [ hh3 , rt5 , w3 [ bn7 ] ] , ex 4 , ex 4, [ex , ex ]";
List<String> res = splitWithCommaOutsideBrackets(s);
for (String t: res) {
    System.out.println(t);
} 

Output of the sample Java code:

ex1
[ex2 , ex3 ]
[ hh3 , rt5 , w3 [ bn7 ] ]
ex 4
ex 4
[ex , ex ]

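Note that trimming the items is optional; remove the trim() calls if you need to keep the surrounding whitespace.

Also, in the two places where I return result early (when the brackets are unbalanced), you may want to throw an exception rather than return the partial result collected up to that point.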
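The exception-throwing variant mentioned above could look roughly like the sketch below. The class name StrictBracketSplitter, the method name splitStrict, the choice of IllegalArgumentException, and the message texts are all assumptions made for illustration; they are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class and method names, used only for this sketch.
final class StrictBracketSplitter {

    // Same scan as splitWithCommaOutsideBrackets, but unbalanced input
    // raises IllegalArgumentException instead of returning a partial list.
    static List<String> splitStrict(String input) {
        int depth = 0; // current nesting depth of [ ... ]
        int start = 0; // index where the current item begins
        List<String> result = new ArrayList<>();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == ',' && depth == 0) {
                result.add(input.substring(start, i).trim());
                start = i + 1;
            } else if (c == '[') {
                depth++;
            } else if (c == ']') {
                depth--;
                if (depth < 0) {
                    throw new IllegalArgumentException("Unmatched ']' at index " + i);
                }
            }
        }
        if (depth > 0) {
            throw new IllegalArgumentException("Missing " + depth + " closing ']'");
        }
        result.add(input.substring(start).trim());
        return result;
    }

    public static void main(String[] args) {
        // Balanced input is split on top-level commas only.
        System.out.println(splitStrict("a , [b , c [d]] , e"));
        try {
            splitStrict("a , [b , c"); // the '[' is never closed
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Whether to fail fast like this or to return whatever was collected depends on how much you trust the input.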

Original answer

In Java character classes, both ] and [ must be escaped, unlike in JavaScript, where only the ] symbol has to be escaped inside a character class.

String pat = ",(?![^\\[]*])";
                    ^^
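As a quick check of the escaping rule, the sketch below compiles both forms of the pattern. Java treats an unescaped [ inside a character class as the start of a nested class (the union syntax, as in [a-d[m-p]]), which is why the JavaScript-style pattern is rejected. The class name BracketEscapeDemo and the helper compiles are made up for this illustration.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical class and helper names, used only for this sketch.
final class BracketEscapeDemo {

    // Returns true if Java accepts the regex, false if Pattern.compile rejects it.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            System.out.println("Rejected " + regex + ": " + e.getDescription());
            return false;
        }
    }

    public static void main(String[] args) {
        // JavaScript-style pattern from the question: '[' is unescaped inside the class.
        System.out.println(compiles(",(?![^[]*])"));
        // Java version: '[' escaped inside the character class.
        System.out.println(compiles(",(?![^\\[]*])"));
    }
}
```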

Here is an IDEONE demo:

String s = "ex1 , [ex2 , ex3 ] , ex 4 , ex 4, [ex , ex ]";
String pat = ",(?![^\\[]*])";
String[] result = s.split(pat);
System.out.println(Arrays.toString(result));

Note that in neither Java nor JS does a ] outside a character class have to be escaped.

5 Comments

Note that the OP's original pattern matches a , that is not followed by 0+ characters other than [ and then a ]. Perhaps a safer pattern would be ",(?![^\\[\\]]*])", but a CSV parser should work best for such strings.
The regex you posted, ",(?![^\[\]]*])", works fine if there is only one pair of brackets, e.g. var=[some,list,here], but it breaks if there are other brackets nested inside those brackets, e.g. var=[some,list,here[something]]. Do you know how this can be fixed?
Yes and no, depends on the string itself. One thing is certain: you cannot use a Java regex to match nested constructs.
Are you sure about that?
Java regex does not support recursion. That is a fact. Nor does it support balancing constructs as .NET regexes do.
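To make the limitation discussed in these comments concrete, here is a small sketch that runs the lookahead pattern and a depth-counting scan on the flat and nested inputs from the comment above. The class and method names are hypothetical, and the scan is a condensed version of the parser idea from the answer that assumes balanced brackets.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical class and method names, used only for this sketch.
final class NestedBracketComparison {

    // Lookahead pattern from the comments: split on a comma that is not
    // followed by a bracket-free run of characters ending in ']'.
    static List<String> splitWithRegex(String input) {
        return Arrays.asList(input.split(",(?![^\\[\\]]*])"));
    }

    // Depth-counting scan; assumes the brackets in the input are balanced.
    static List<String> splitWithCounter(String input) {
        List<String> parts = new ArrayList<>();
        int depth = 0;
        int start = 0;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '[') {
                depth++;
            } else if (c == ']') {
                depth--;
            } else if (c == ',' && depth == 0) {
                parts.add(input.substring(start, i));
                start = i + 1;
            }
        }
        parts.add(input.substring(start));
        return parts;
    }

    public static void main(String[] args) {
        String flat = "var=[some,list,here]";
        String nested = "var=[some,list,here[something]]";
        System.out.println(splitWithRegex(flat));     // one item: no top-level comma
        System.out.println(splitWithRegex(nested));   // three items: the lookahead stops at the inner '['
        System.out.println(splitWithCounter(nested)); // one item: depth never returns to 0 at a comma
    }
}
```

The lookahead only inspects the text up to the next bracket, so it cannot tell how deep the current position is nested; the counter tracks exactly that.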
