I can split a string on two consecutive spaces:

string Line = "1  2";

Regex.Split(Line, "  ");

=> 1, 2

I would like to add an exception: only split when the spaces are not enclosed by [ ], as shown in this example.

string Line = "1  2  [1  2]";

Regex.Split(Line, "  ");

=> 1, 2, [1  2]

Can I fairly easily achieve this via regex? By the way, I use .NET.

  • Is it just numbers/digits? Or will it be other things like 1 2 hello [how are] you? Commented Nov 16, 2012 at 15:44
  • Intuition tells me that that's a problem beyond the scope of regex Commented Nov 16, 2012 at 15:48
  • And will you have nesting? 1 2 [1 2 [1 2]] 3 4 what should that produce? Commented Nov 16, 2012 at 15:49
  • There is no nesting and I only expect numbers. The answer below looks nice. Commented Nov 16, 2012 at 15:52

1 Answer

You could use a negative lookahead that asserts there is no closing ] before the next opening [ or the end of the string:

Regex.Split(Line, @"[ ]+(?![^\[\]]*\])");

This will fail if you have nested [...] structures, though. Note that the lookahead is not part of the actual match; it only checks what follows without consuming anything. Inside the lookahead I used [^\[\]], a negated character class that matches any character except a square bracket of either kind.
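
To make the behaviour concrete, here is a minimal sketch that applies the pattern to the input from the question and to a nested input. The class name BracketSplitDemo and the helper SplitOutsideBrackets are hypothetical names introduced only for illustration; the only library call is Regex.Split.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BracketSplitDemo
{
    // Hypothetical helper: splits on runs of spaces unless a closing ']'
    // follows before any other square bracket (i.e. the spaces sit inside [...]).
    public static string[] SplitOutsideBrackets(string line)
    {
        return Regex.Split(line, @"[ ]+(?![^\[\]]*\])");
    }

    public static void Main()
    {
        // Input from the question: two spaces between items.
        Console.WriteLine(string.Join("|", SplitOutsideBrackets("1  2  [1  2]")));
        // prints: 1|2|[1  2]

        // Nested brackets are not handled: the spaces after "[2" are followed
        // by '[' rather than ']', so the pattern treats them as a separator.
        Console.WriteLine(string.Join("|", SplitOutsideBrackets("1  [2  [3  4]  5]  6")));
        // prints: 1|[2|[3  4]  5]|6
    }
}
```

The second call shows the nesting limitation: the outer group is cut in two because the lookahead only inspects text up to the next bracket of either kind.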

Also note that this splits on 1 or more spaces. If you want to require at least two, replace [ ]+ with [ ]{2,} and if you want exactly two with [ ]{2}.
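
As a rough comparison of the three quantifiers, the sketch below runs each one against a line that mixes one, two, and three spaces. SpaceQuantifierDemo, SplitWith, and the sample line are assumptions made up for this illustration, not part of the original question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SpaceQuantifierDemo
{
    // Hypothetical helper: 'spaces' is the separator sub-pattern; the same
    // bracket-excluding negative lookahead is appended to it.
    public static string[] SplitWith(string spaces, string line)
    {
        return Regex.Split(line, spaces + @"(?![^\[\]]*\])");
    }

    public static void Main()
    {
        // One space after "1", two after "2", three after "3", two after "4".
        string line = "1 2  3   4  [5  6]";

        Console.WriteLine(string.Join("|", SplitWith("[ ]+", line)));     // 1|2|3|4|[5  6]
        Console.WriteLine(string.Join("|", SplitWith("[ ]{2,}", line)));  // 1 2|3|4|[5  6]
        Console.WriteLine(string.Join("|", SplitWith("[ ]{2}", line)));   // 1 2|3| 4|[5  6]
    }
}
```

With [ ]{2}, a run of three spaces is split on its first two, so the third space stays attached to the following item (" 4"). If runs longer than two can occur, [ ]{2,} is probably the safer choice.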

Further reading on lookarounds.

8 Comments

It does not seem to work. How would this regex look for other delimiters (<> instead of [])?
@csetzkorn it works fine for me. For input 1 2 <1 2>, use @" (?![^\<\>]*\>)". If that does not work, please let me know what exactly happens instead, and show some code if possible.
Works for me as long as there is only one space between @" and (...
@garyh the example input has two spaces, which is why I used two spaces in the regex (and it works for me with two spaces). I just realised that my last comment contained only one space. That was meant to be two. But maybe it's best to allow an arbitrary number: @"[ ]+(?![^\[\]]*\])" or @"[ ]+(?![^\<\>]*\>)"
Thanks, it works. You forgot a space: @" (?![^\<\>]*\>)" => @"  (?![^\<\>]*\>)".