
I'm looking for an easy way (easy for the person who will enter all the information about the commands and parameters) to match commands against a list of known commands and extract specific parameters from them. For example, take these two commands:

  • SENDDR456 (where 456 is the parameter)
  • GETmsg35 (where msg and 35 are two parameters)

I thought regex would be the best option. The goal, again, is to make the identifier/extractor scalable, so that more commands can be added easily. I'm using C#.

3
  • The lack of defined delimitation will make this extremely difficult to maintain and extend. It's a parsing problem, you can keep the commands in a list, but the parameters, well, how will you possibly know that SENDDR456 means SENDDR(456) and not SENDDR(4, 56) or SENDDR(45, 6)? You get the point, it's a crappy protocol. Commented Jul 16, 2014 at 5:52
  • Hey there, are any of the answers helping, or are you still having problems with it? If so, please give more details. :) Commented Jul 16, 2014 at 22:16
  • I found that CaptureCollection might be the best answer. I will look into it and post any insights. Commented Jul 17, 2014 at 4:34

3 Answers

1

.NET CaptureCollection Can Tokenize

But It Depends on Whether Consecutive Parameters Can Be Well-Delimited or Well-Specified

For your example, you can use this regex:

(SENDDR|GET)(\d+|[a-z]+)+

This relies on the CaptureCollection feature specific to .NET regex: when a capture group is quantified, every intermediate capture is preserved and remains accessible, in order.

  • Groups[1].Value contains the command
  • Group 2 holds the parameters in a capture collection: Groups[2].Captures[0].Value contains the first parameter, Groups[2].Captures[1].Value contains the second parameter

But note that this relies on the parameters being well-specified or delimited. For instance, in this example, one parameter is specified by [a-z]+, the other by \d+, which are mutually exclusive.
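As a minimal sketch of reading those captures in C#: the class name CaptureDemo and the Parse helper below are made up for illustration, and the pattern is anchored with ^ and $ (an addition, not part of the regex above) so that the whole input has to match.

using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    // Same pattern as above, anchored so partial matches are rejected (assumption).
    private static readonly Regex CommandPattern =
        new Regex(@"^(SENDDR|GET)(\d+|[a-z]+)+$");

    // Returns the command followed by its parameters, or null when nothing matches.
    public static string[] Parse(string input)
    {
        Match m = CommandPattern.Match(input);
        if (!m.Success) return null;

        // Group 2 is quantified, so each repetition is kept in its CaptureCollection.
        var parameters = m.Groups[2].Captures.Cast<Capture>().Select(c => c.Value);
        return new[] { m.Groups[1].Value }.Concat(parameters).ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Parse("GETmsg35")));   // GET, msg, 35
        Console.WriteLine(string.Join(", ", Parse("SENDDR456")));  // SENDDR, 456
    }
}

Because \d+ is greedy, 456 comes back as a single parameter. Splitting it into 4 and 56 would need extra information about parameter widths that the pattern does not have.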


1 Comment

If you have thoughts about the parameter specs or delimiters, let me know so we can fine-tune the regex. :)
0

Assuming your command is uppercase and the parameters are lowercase letters or numbers, you can use (\\d+)|([a-z]+):

var matches1 = Regex.Matches("GETmsg35", "(\\d+)|([a-z]+)"); 
foreach(Match match in matches1)
   Console.WriteLine(match.Value);

To include the command as well, you can use ([A-Z]+)|([a-z]+)|(\\d+):

var matches1 = Regex.Matches("GETmsg35", "([A-Z]+)|([a-z]+)|(\\d+)");
// The first match is the uppercase command; the remaining matches are parameters.
if (matches1.Count > 0)
    Console.WriteLine("Command >> " + matches1[0].Value);
for (int i = 1; i < matches1.Count; i++)
    Console.WriteLine("Parameters >> " + i + "\t" + matches1[i].Value);

Output

Command >> GET
Parameters >> 1  msg
Parameters >> 2  35


0

If the information lies in strings with different styles, lengths, or offsets, regex is likely the best solution.

Here there are just two commands, SENDDR and GET, which could be checked with Substring. But the parameter length is not the same every time, so you would still have to scan for digits and letters yourself. In this case regex is easier than implementing all of that by hand.

Here is the regex, assuming each parameter is either a run of digits or a run of lowercase letters:

       (SENDDR|GET)(\d+|[a-z]+)+
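To keep this easy to extend, one option is to store each command next to a pattern for its own parameters instead of hard-coding the alternation. The following is only a sketch under that assumption: CommandRegistry, ParameterPatterns, and the group name p are hypothetical names, and the two entries simply mirror the examples from the question.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class CommandRegistry
{
    // Hypothetical table: command name -> regex for that command's parameters.
    // Supporting a new command means adding one entry here.
    private static readonly Dictionary<string, string> ParameterPatterns =
        new Dictionary<string, string>
        {
            { "SENDDR", @"(?<p>\d+)" },
            { "GET",    @"(?<p>[a-z]+)(?<p>\d+)" },
        };

    // Returns the command followed by its parameters, or null for unknown input.
    public static string[] Parse(string input)
    {
        foreach (var entry in ParameterPatterns)
        {
            // Escape the name and anchor the pattern so the whole input must match.
            string pattern = "^" + Regex.Escape(entry.Key) + entry.Value + "$";
            Match m = Regex.Match(input, pattern);
            if (!m.Success) continue;

            // .NET lets a group name repeat; its captures are collected in order.
            var parameters = m.Groups["p"].Captures.Cast<Capture>().Select(c => c.Value);
            return new[] { entry.Key }.Concat(parameters).ToArray();
        }
        return null;
    }
}

With this table, CommandRegistry.Parse("GETmsg35") yields GET, msg, 35. Spelling out each command's parameters explicitly also gives a place to state fixed widths, such as \d{2}, when adjacent parameters would otherwise be ambiguous.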
