I'm trying to split the following string "Name=='mynme' && CurrentTime<'2012-04-20 19:45:45'" into this:

Name
==
'mynme'
&&
CurrentTime
<
'2012-04-20 19:45:45'

I have the following regex:

([+\\-*/%()]{1}|[=<>!]{1,2}|[&|]{2})

The problem is that when I use the above regex, I get the following result:

Name
== 
'mynme'
&&
CurrentTime 
<
'2012
-
04
-
20
19:45:45'

Essentially, I need the regex to be quote-aware.

Thanks

Update 1 regarding lordcheeto's answer:

Your response is close. But the following is still not split correctly:

 string input2 = "((1==2) && 2-1==1) || 3+1==4 && Name=='Stefan+123'";

What I need to do is to split a string into operators and operands. Something like this:

 LeftOperand Operator RightOperand

Now, if an operator appears between single quotes, it should be ignored, and the whole quoted string should be treated as a single operand.

The string above should generate the following output:

(

(
1
==
2
)

&&
2
-
1
==
1
)

||
3
+
1
==
4
&&
Name
==
'Stefan+123'
  • Just edited it, should work now. Commented Apr 23, 2012 at 23:07

3 Answers

Ok, assuming you want it to simply split on logical and relational operators, you can use this pattern:

string lordcheeto = @"\s*(==|&&|<=|>=|<|>)\s*";    

Because the \s* on each side of the group consumes the whitespace around every operator, the returned strings also come back with that whitespace trimmed.

Code:

using System;
using System.Text.RegularExpressions;

namespace RegEx
{
    class Program
    {
        static void Main(string[] args)
        {
            string original = "([+\\-*/%()]{1}|[=<>!]{1,2}|[&|]{2})";
            string lordcheeto = @"\s*(==|&&|<=|>=|<|>)\s*";

            string input = "Name=='mynme' && CurrentTime<45 - 4";
            string input1 = "Name=='mynme' && CurrentTime<'2012-04-20 19:45:45'";
            string ridiculous = "Name == BLAH && !@#>=$%^&*()< ASDF &&    this          >          that";

            executePattern("original", input, original);
            executePattern("lordcheeto's", input, lordcheeto);
            executePattern("original", input1, original);
            executePattern("lordcheeto's", input1, lordcheeto);
            executePattern("original", ridiculous, original);
            executePattern("lordcheeto's", ridiculous, lordcheeto);
        }

        static void executePattern(string version, string input, string pattern)
        {
            // Avoiding repetition for this example.
            Console.WriteLine("Using {0} pattern:", version);

            // Trim the input first; the pattern only removes whitespace next to operators.
            var result = Regex.Split(input.Trim(), pattern);

            // Pipes included to highlight whitespace trimming.
            foreach (var m in result)
                Console.WriteLine("|{0}|", m);

            // Extra space.
            Console.WriteLine();
            Console.WriteLine();
        }
    }
}

Test:

https://ideone.com/DaRtP

Output:

Using original pattern:
|Name|
|==|
|'mynme' |
|&&|
| CurrentTime|
|<|
|45 |
|-|
| 4|


Using lordcheeto's pattern:
|Name|
|==|
|'mynme'|
|&&|
|CurrentTime|
|<|
|45 - 4|


Using original pattern:
|Name|
|==|
|'mynme' |
|&&|
| CurrentTime|
|<|
|'2012|
|-|
|04|
|-|
|20 19:45:45'|


Using lordcheeto's pattern:
|Name|
|==|
|'mynme'|
|&&|
|CurrentTime|
|<|
|'2012-04-20 19:45:45'|


Using original pattern:
|Name |
|==|
| BLAH |
|&&|
| |
|!|
|@#|
|>=|
|$|
|%|
|^&|
|*|
||
|(|
||
|)|
||
|<|
| ASDF |
|&&|
|    this          |
|>|
|          that|


Using lordcheeto's pattern:
|Name|
|==|
|BLAH|
|&&|
|!@#|
|>=|
|$%^&*()|
|<|
|ASDF|
|&&|
|this|
|>|
|that|

Edit

Ok, with the additional constraints, you should be able to use this:

string lordcheeto = @"\s*('.*?'|&&|==|<=|>=|<|>|\(|\)|\+|-|\|\|)\s*";

This will still trim all whitespace from the returned strings. It will, however, return empty strings when two matches are directly adjacent (e.g. the == and 'Stefan+123' in Name=='Stefan+123'), because Regex.Split returns the empty text between them. I was unable to work around that this time, but it's not so important.

If you import System.Linq and System.Collections.Generic and make the results a List<string>, you can remove all empty strings from the list in one extra line like this (which may be slower than a plain for loop):

var results = Regex.Split(input.Trim(), pattern).ToList();
results.RemoveAll(x => x == "");
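
As a self-contained sketch of the filtering step above, the split and the cleanup can be wrapped in one helper. The class and method names (SplitHelper, SplitNonEmpty) are hypothetical and not part of the original answer; only Regex.Split and the LINQ Where/ToArray calls are standard.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class SplitHelper
{
    // Hypothetical helper: splits on the pattern and drops the empty
    // strings that Regex.Split returns between adjacent matches.
    public static string[] SplitNonEmpty(string input, string pattern)
    {
        return Regex.Split(input.Trim(), pattern)
                    .Where(s => s.Length > 0)
                    .ToArray();
    }
}
```

Called with the edited pattern and "Name=='mynme' && CurrentTime<45 - 4", this should return the same tokens as the output below, minus the empty entries.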

Code:

using System;
using System.Text.RegularExpressions;

namespace RegEx
{
    class Program
    {
        static void Main(string[] args)
        {
            string lordcheeto = @"\s*('.*?'|&&|==|<=|>=|<|>|\(|\)|\+|-|\|\|)\s*";

            string input = "Name=='mynme' && CurrentTime<45 - 4";
            string input1 = "Name=='mynme' && CurrentTime<'2012-04-20 19:45:45'";
            string input2 = "((1==2) && 2-1==1) || 3+1==4 && Name=='Stefan+123'";

            executePattern("lordcheeto's", input, lordcheeto);
            executePattern("lordcheeto's", input1, lordcheeto);
            executePattern("lordcheeto's", input2, lordcheeto);

            Console.ReadLine();
        }

        static void executePattern(string version, string input, string pattern)
        {
            // Avoiding repetition for this example.
            Console.WriteLine("Using {0} pattern:", version);

            // Trim the input first; the pattern only removes whitespace next to operators.
            var result = Regex.Split(input.Trim(), pattern);

            // Pipe included to highlight empty strings.
            foreach (var m in result)
                Console.WriteLine("|{0}", m);

            // Extra space.
            Console.WriteLine();
            Console.WriteLine();
        }
    }
}

Test:

https://ideone.com/r5lpE

Output:

Using lordcheeto's pattern:
|Name
|==
|
|'mynme'
|
|&&
|CurrentTime
|<
|45
|-
|4


Using lordcheeto's pattern:
|Name
|==
|
|'mynme'
|
|&&
|CurrentTime
|<
|
|'2012-04-20 19:45:45'
|


Using lordcheeto's pattern:
|
|(
|
|(
|1
|==
|2
|)
|
|&&
|2
|-
|1
|==
|1
|)
|
|||
|3
|+
|1
|==
|4
|&&
|Name
|==
|
|'Stefan+123'
|

Additional Comments:

If you want to split on any other operators (e.g., <<, +=, =, -=, >>) as well (there are a lot of them), or need anything else, just ask.
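
If the empty strings are a nuisance, one alternative (a sketch, not the approach used in this answer) is to match the tokens with Regex.Matches instead of splitting around them. The Tokenizer class and its pattern are assumptions made for illustration: the pattern tries quoted strings first, then two-character operators, then single-character operators, then any run of remaining non-space characters.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class Tokenizer
{
    // Hypothetical token pattern. Alternation order matters: a quoted
    // string is consumed whole before any operator inside it is seen.
    // A lone &, |, = or ! matches nothing and is silently skipped.
    private static readonly Regex TokenPattern = new Regex(
        @"'[^']*'|&&|\|\||==|!=|<=|>=|[<>()+\-*/%]|[^\s'&|=!<>()+\-*/%]+");

    public static List<string> Tokenize(string input)
    {
        var tokens = new List<string>();
        foreach (Match m in TokenPattern.Matches(input))
            tokens.Add(m.Value);   // whitespace is never matched, so never returned
        return tokens;
    }
}
```

Because only matches are collected, no empty strings are produced. Note that, unlike the split patterns above, an unquoted operand containing spaces would come back as several tokens.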

1 Comment

Thanks for the answer. Please see my update on the original post.

Thanks, lordcheeto, for the answer. I was able to solve a similar problem with your solution, and I am sharing my problem and solution in case they help anyone with a similar problem.

I had to split a string like

"abc < 1 && 124 > 2 || 1243 <= 555";

first into

abc < 1
&&
124 > 2
||
1243 <= 555

I achieved this using

string[] conditions = Regex.Split(str, @"\s*('.*?'|&&|\|\|)\s*");

Then I had to split each condition, such as

abc < 1

into

abc
<
1

I achieved this using

string[] statements = Regex.Split(conditions[0], @"\s*('.*?'|==|<=|>=|<|>|!=)\s*");
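
The two stages above can be combined into one pass, sketched here under the assumption that operands are unquoted (with these patterns a quoted operand would be split off as its own piece in the first stage). The ConditionSplitter class is a hypothetical name; the two patterns are the ones from this answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class ConditionSplitter
{
    // Stage 1 splits on && and ||; stage 2 splits each condition
    // into left operand, relational operator, right operand.
    public static List<string[]> Split(string expression)
    {
        var parts = new List<string[]>();
        foreach (string piece in Regex.Split(expression.Trim(), @"\s*('.*?'|&&|\|\|)\s*"))
        {
            if (piece == "&&" || piece == "||")
                parts.Add(new[] { piece });   // logical operator kept as-is
            else
                parts.Add(Regex.Split(piece, @"\s*('.*?'|==|<=|>=|<|>|!=)\s*"));
        }
        return parts;
    }
}
```

For "abc < 1 && 124 > 2 || 1243 <= 555" this gives five entries: three operand/operator/operand triples separated by the single-element entries for && and ||.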


I came up with a solution using a regex with named groups. Some of the techniques here might be of interest.

^(?<variable1>\p{Po}?\w+\p{Po}?)\s*(?<logical_operator1>[=|+&<%*()-]{1,2})\s*(?<value1>\p{Po}?\w+\p{Po}?)\s*(?<logical_operator2>[=|+&<%*()-]{1,2})\s*(?<variable2part1>\w+)(?<variable2part2><)(?<variable2part3>\p{Po}?\d{4}-\d{2}-\d{2}\s*\d{2}:\d{2}:\d{2}\p{Po}?)

for

Name=='mynme' && CurrentTime<'2012-04-20 19:45:45'

with this substitution

$1\n$2\n$3\n$4\n$5\n$6\n$7

returns

Name
==
'mynme'
&&
CurrentTime
<
'2012-04-20 19:45:45'

Explanation:

For \p{<category>}, see: https://learn.microsoft.com/en-us/dotnet/standard/base-types/character-classes-in-regular-expressions#unicode-category-or-unicode-block-p

^
(?<variable1>           # Group: variable1
    \p{Po}?             # Optional punctuation (Po = Other Punctuation)  -- I would use \p{Pi} and \p{Pf} if possible but `'` is considered an "other punctuation"
    \w+                 # One or more word characters
    \p{Po}?             # Optional punctuation
)
\s*
(?<logical_operator1>    # Group: logical_operator1
    [=|+&<%*()-]{1,2}    # One or two of these symbols
)
\s*
(?<value1>              # Group: value1
    \p{Po}?             # Optional punctuation
    \w+                 # Word characters
    \p{Po}?             # Optional punctuation
)
\s*
(?<logical_operator2>   # Group: logical_operator2
    [=|+&<%*()-]{1,2}   # Again, one or two logical/operator symbols
)
\s*
(?<variable2part1>\w+)   # Group: variable2part1, word characters
(?<variable2part2><)     # Group: variable2part2, a literal <
(?<variable2part3>       # Group: variable2part3, a datetime
    \p{Po}?              # Optional punctuation
    \d{4}-\d{2}-\d{2}    # Date: yyyy-mm-dd
    \s*
    \d{2}:\d{2}:\d{2}    # Time: hh:mm:ss
    \p{Po}?              # Optional punctuation
)

Assumptions: variable2part3 will always be of format ####-##-## ##:##:##
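
To show how the named groups might be read from C# rather than through a substitution string, here is a minimal sketch. The NamedGroupDemo class and its Parts method are hypothetical; the pattern is the one above, split across several string literals only for readability.

```csharp
using System;
using System.Text.RegularExpressions;

static class NamedGroupDemo
{
    private const string Pattern =
        @"^(?<variable1>\p{Po}?\w+\p{Po}?)\s*(?<logical_operator1>[=|+&<%*()-]{1,2})\s*" +
        @"(?<value1>\p{Po}?\w+\p{Po}?)\s*(?<logical_operator2>[=|+&<%*()-]{1,2})\s*" +
        @"(?<variable2part1>\w+)(?<variable2part2><)" +
        @"(?<variable2part3>\p{Po}?\d{4}-\d{2}-\d{2}\s*\d{2}:\d{2}:\d{2}\p{Po}?)";

    // Returns the seven captured parts in order, or an empty array
    // when the input does not have the exact shape the pattern expects.
    public static string[] Parts(string input)
    {
        Match m = Regex.Match(input, Pattern);
        if (!m.Success) return new string[0];
        return new[]
        {
            m.Groups["variable1"].Value,
            m.Groups["logical_operator1"].Value,
            m.Groups["value1"].Value,
            m.Groups["logical_operator2"].Value,
            m.Groups["variable2part1"].Value,
            m.Groups["variable2part2"].Value,
            m.Groups["variable2part3"].Value
        };
    }
}
```

Since the pattern hard-codes the overall shape of the expression, any input with a different number of conditions or a non-datetime right-hand side simply fails to match.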
