I think you'll find too many edge cases when trying to pull this off with regular expressions. Dealing with the quotes is what really complicates things, not to mention escape characters.
A procedural solution is not complicated, and it will be faster and easier to modify as your needs change. Note that I don't know what the escape characters should be in your example, but you could certainly add them to the algorithm:
string CodeSnippet = Resource1.CodeSnippet;
StringBuilder CleanCodeSnippet = new StringBuilder();
bool InsideQuotes = false;
bool InsideComment = false;

Console.WriteLine("BEFORE");
Console.WriteLine(CodeSnippet);
Console.WriteLine("");

for (int i = 0; i < CodeSnippet.Length; i++)
{
    switch (CodeSnippet[i])
    {
        case '"':
            // A quote inside a comment does not open or close a string.
            if (!InsideComment) InsideQuotes = !InsideQuotes;
            break;
        case '#':
            // A '#' inside a string is literal text, not a comment.
            if (!InsideQuotes) InsideComment = true;
            break;
        case '\n':
            // A comment always ends at the end of the line.
            InsideComment = false;
            break;
    }

    // Copy every character that is not part of a comment.
    if (!InsideComment)
    {
        CleanCodeSnippet.Append(CodeSnippet[i]);
    }
}

Console.WriteLine("AFTER");
Console.WriteLine(CleanCodeSnippet.ToString());
Console.WriteLine("");
This example strips the comments away from the CodeSnippet. I assumed that's what you were after.
Here's the output:
BEFORE
"\#" TEST #comment hello world
"ab" TEST #comment hello world
"ab" TEST #comment "hello world
"ab" + "ca" + TEST #comment
"\#" TEST
"ab" TEST
AFTER
"\#" TEST
"ab" TEST
"ab" TEST
"ab" + "ca" + TEST
"\#" TEST
"ab" TEST
As I said, you'll probably need to add escape-character handling to the algorithm. But this is a good starting point.
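As one way to add escape handling, here is a minimal sketch. It assumes that a backslash inside a quoted string escapes the next character, which the question does not confirm. CommentStripper and Strip are hypothetical names, not part of the original code.

```csharp
using System;
using System.Text;

public static class CommentStripper
{
    // Removes '#' comments. ASSUMPTION: inside a quoted string, a backslash
    // escapes the next character, so \" does not close the string.
    public static string Strip(string code)
    {
        var clean = new StringBuilder(code.Length);
        bool insideQuotes = false;
        bool insideComment = false;
        bool escaped = false; // true when the previous character was an escaping backslash

        foreach (char c in code)
        {
            if (c == '\n')
            {
                // A newline always ends a comment; it is kept in the output.
                insideComment = false;
                escaped = false;
            }
            else if (insideComment)
            {
                continue; // drop comment text
            }
            else if (escaped)
            {
                escaped = false; // this character is escaped: copy it, do not interpret it
            }
            else if (insideQuotes && c == '\\')
            {
                escaped = true;
            }
            else if (c == '"')
            {
                insideQuotes = !insideQuotes;
            }
            else if (c == '#' && !insideQuotes)
            {
                insideComment = true;
                continue; // drop the '#' itself
            }

            clean.Append(c);
        }

        return clean.ToString();
    }
}
```

With this rule, an input such as "a\"#b" TEST #comment keeps the # inside the string and drops only the trailing comment, whereas the simpler loop would treat the escaped quote as the end of the string and cut the line short.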
"\#" TEST #" TEST #comment hello world- presumably, the comment starts at the second#- but how would you distinguish that?"\#" TESTYou really need something that's able to determine if you're inside a pair of quotes. This may be possible with balanced matching, but it's gonna get really complex.@"...", Python'sr"""..."""or PHP's'...') or comments (e.g./*...*/), then you'll need to parse the whole document starting from the beginning to do it right.