I have a C# application in which I use Regex to run an expect against the response from a Unix server. I currently have this:

//will pick up :
//  What is your name?:
//  [root@localhost ~]#
//  [root@localhost ~]$
//  Do you want to continue [y/N]
//  Do you want to continue [Y/n]
const string Command_Prompt_Only = @"[$#]|\[.*@(.*?)\][$%#]";
const string Command_Question_Only = @".*\?:|.*\[y/N\]/g";
const string Command_Prompt_Question = Command_Question_Only + "|" + Command_Prompt_Only;

This works, as I've tested it with www.regexpal.com, but I think it needs some optimization: at times it slows way down when I use Command_Prompt_Question.

var promptRegex = new Regex(Command_Prompt_Question);
var output = _shellStream.Expect(promptRegex, timeOut);

I should mention that I'm using SSH.NET to talk to these Linux servers, but I don't think it's an SSH.NET issue, because it's fast when I use Command_Prompt_Only.

Does anyone see any issues with the const string I'm using? Is there a better way to do it?

My project is open source if you feel like you want to go play with it.
https://github.com/gavin1970/Linux-Commander

Code in question: https://github.com/gavin1970/Linux-Commander/blob/master/Linux-Commander/common/Ssh.cs

It's called Linux Commander, and I'm attempting to build a virtual Linux console with Ansible support.

  • Have you tried using a cached (static readonly Regex) with RegexOptions.Compiled? I note that your regex is using ECMAScript syntax /g which is not supported by .NET - you also aren't putting each sub-expression in a non-capturing group - is that intentional? Commented Sep 22, 2020 at 22:16
  • Wasn't aware that /g wasn't supported. Thanks for that info. So, are you saying by grouping it will make it faster? ([$#])|([.*@(.*?)][$%#])|(.*\?:)|(.*[y/N])|(.*[Y/n]) Commented Sep 22, 2020 at 23:15
  • btw, static readonly should be used for public, where const is more for private use and would be faster for the reads. This variable isn't something that should change, once I get it set correctly. Commented Sep 22, 2020 at 23:19
  • That is incorrect: you cannot use const with reference-types like Regex, you can only use static readonly. Saying "const is faster" is an oversimplification: const and static readonly have different semantics and the C# compiler will inline const values even across assembly boundaries (which can be a source of bugs if you don't rebuild when updating assembly dependencies). Commented Sep 22, 2020 at 23:19
  • Using /g in ECMAScript regex is the same as enumerating all Match values in the MatchCollection - however your code doesn't seem to evaluate the regex by itself, instead it's done by the .Expect method. Where is that defined and what does it do? Commented Sep 22, 2020 at 23:20

2 Answers

Does anyone see any issues with the const string I'm using?

Yes, there is too much backtracking in those patterns.

If one knows that there is at least one item, specifying a * (zero or more) can make the regex engine consider many zero-length matches. It's better to prefer the + (one or more) quantifier, which can shave a lot of time off exploring dead ends during backtracking.


This part is interesting: \[.*@(.*?)\]. Why not use a negated set ([^ ]) instead, such as this change:

\[[^@]+@[^\]]+\]

This says: anchor on a literal "[", then find one or more characters that are not a literal "@" ([^@]+), then a literal "@", then one or more characters that are not a literal "]" ([^\]]+), followed by the closing "]".
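As a rough illustration of the negated-set idea, here is a minimal sketch. The class and method names are hypothetical, and the trailing [$%#] is carried over from the question's Command_Prompt_Only pattern rather than from the pattern above, so treat it as an assumption about what the prompt check should include.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PromptPatternDemo
{
    // Negated sets cannot run past "@" or "]", so there is far less
    // for the engine to backtrack over than with ".*".
    // The trailing [$%#] is an assumption taken from the question's pattern.
    public static readonly Regex ShellPrompt =
        new Regex(@"\[[^@]+@[^\]]+\][$%#]", RegexOptions.Compiled);

    // Hypothetical helper: true when the text contains a shell prompt.
    public static bool IsPrompt(string text) => ShellPrompt.IsMatch(text);

    public static void Main()
    {
        Console.WriteLine(IsPrompt("[root@localhost ~]#"));  // True
        Console.WriteLine(IsPrompt("[root@localhost ~]$"));  // True
        Console.WriteLine(IsPrompt("What is your name?:"));  // False
    }
}
```

This only covers the bracketed prompt case; the question-style prompts would still need their own alternatives.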

2 Comments

The post above shows what I'm attempting to capture. There are many scenarios, and I must pass them all in with one call to SSH.NET. @ is part of the string I'm looking for. [root@localhost ~]# is just one scenario.
I have taken your suggestion about * vs +.

Try this:

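using System.Text.RegularExpressions;

class Foo
{
    const string Command_Prompt_Only     = @"[$#]|\[.*@(.*?)\][$%#]";
    const string Command_Question_Only   = @".*\?:|.*\[y/N\]";

    // Wrap each half in a non-capturing group so the "|" joins them cleanly.
    const string Command_Prompt_Question = "(?:" + Command_Question_Only + ")|(?:" + Command_Prompt_Only + ")";

    // Build the Regex once and reuse it instead of constructing it on every call.
    private static readonly Regex _promptRegex = new Regex( Command_Prompt_Question, RegexOptions.Compiled );

    public void WaitForPrompt()
    {
        // ...

        // _shellStream and timeOut come from the question's code.
        var output = _shellStream.Expect( _promptRegex, timeOut );
    }
}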

3 Comments

/g has no meaning in .net, it matches a literal slash followed by 'g'.
@PoulBak I left that in my answer because at the time I didn't know if the OP wanted it there or not.
You should also use non-greedy quantifiers; that will speed it up.
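To make the point about /g concrete, here is a small sketch using the question's Command_Question_Only pattern with and without the trailing /g. The class and constant names are hypothetical; only Regex.IsMatch from System.Text.RegularExpressions is used.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SlashGDemo
{
    // The question's pattern, where "/g" is treated as two literal characters.
    public const string WithSlashG = @".*\?:|.*\[y/N\]/g";

    // The same pattern with the ECMAScript-style flag removed.
    public const string WithoutSlashG = @".*\?:|.*\[y/N\]";

    public static void Main()
    {
        const string prompt = "Do you want to continue [y/N]";

        // No "/g" follows "[y/N]" in the input, so this alternative never matches.
        Console.WriteLine(Regex.IsMatch(prompt, WithSlashG));        // False
        Console.WriteLine(Regex.IsMatch(prompt, WithoutSlashG));     // True

        // It only matches when the input really contains "/g" after "[y/N]".
        Console.WriteLine(Regex.IsMatch(prompt + "/g", WithSlashG)); // True
    }
}
```

If that reading is right, the [y/N] prompt would never satisfy the original pattern, so an expect on it could only end at the timeout, which might account for some of the apparent slowness.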
