I would like to validate an input.

I want the input to be either blank or a comma-separated list of entries, each surrounded by single quotes. One, two, or more entries should be allowed, with any characters and any length between the single quotes. Example: 'one','two','three','\four','.dog','cat'

If I use this:

$string = Read-Host "Enter a string"
$pattern = "^$|^'\w+'(,'(\w+)')*$"
if ($string -match $pattern) {
    Write-Host "The string meets the pattern."
} else {
    Write-Host "The string DOES NOT meet the pattern."
}

For empty entry this works fine: ^$|

^'\w+'(,'(\w+)')*$ works for word characters (a-z, A-Z, 0-9, and _). But how do I allow any character in an entry, including backslashes, periods and other special characters?

Thank you.

2 Answers

Please use $pattern = "^$|^'[^']*'(,'[^']*')*$".

[^']* matches a run of zero or more characters, each of which can be anything except the single quote ('). This allows any character between the single quotes, including backslashes, periods, and other special characters.

  • ^$|^ => either an empty string or the following pattern, anchored at the start of the input.
  • '[^']*' => the first ' matches the opening single quote. Inside the brackets, ^ negates the character class, and * means zero or more occurrences, so [^']* matches zero or more characters other than the single quote ('). The final ' matches the closing single quote.
  • (,'[^']*')* => a comma followed by another single-quoted entry, with the trailing * repeating that group zero or more times.
  • $ => anchors the match at the end of the input, so nothing may follow the last closing quote.

Sample Code:

$string = Read-Host "Enter a string"
$pattern = "^$|^'[^']*'(,'[^']*')*$"
if ($string -match $pattern) {
    Write-Host "The string meets the pattern."
} else {
    Write-Host "The string DOES NOT meet the pattern."
}

Sample Outputs:

Enter a string: 'one','two','three','\four','.dog','cat'
The string meets the pattern.

Enter a string:
The string meets the pattern.

Enter a string: '.dog','\cat'
The string meets the pattern.
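
To check the pattern without typing into Read-Host each time, a small non-interactive sketch can run it against a list of inputs. The sample strings and the Text / IsValid property names below are assumptions chosen purely for illustration; only the pattern itself comes from the answer.

```powershell
# Pattern exactly as posted in the answer above.
$pattern = "^$|^'[^']*'(,'[^']*')*$"

# Hypothetical inputs: the first three should pass, the last three should fail.
$samples = @(
    ''                   # blank input
    "'one','two'"        # two quoted entries
    "'\four','.dog'"     # backslash and period inside the quotes
    "'one',"             # trailing comma
    "one,two"            # missing quotes
    "'it's'"             # stray quote inside an entry
)

# Pair each input with the result of the -match test.
$results = foreach ($s in $samples) {
    [pscustomobject]@{ Text = $s; IsValid = ($s -match $pattern) }
}

$results | Format-Table -AutoSize
```

The last case shows the one limitation of [^']: an entry cannot itself contain a single quote.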

2 Comments

beat me to it with the same exact pattern haha, tho I think '[^']*' should be '[^']+' unless OP wants to allow '' as input. Might need feedback on his side on that.
@SantiagoSquarzon - Good point, thank you. Yes, that is a fair consideration. It is best to not allow for '' as input.
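
The difference discussed in these comments can be seen in a minimal sketch that compares the posted pattern with the + variant. The variable and property names are assumptions for illustration only.

```powershell
$allowEmpty = "^$|^'[^']*'(,'[^']*')*$"   # as posted: '' is a valid entry
$nonEmpty   = "^$|^'[^']+'(,'[^']+')*$"   # comment's suggestion: each entry needs 1+ characters

$comparison = foreach ($s in "''", "'a',''", "'a','b'") {
    [pscustomobject]@{
        Text       = $s
        AllowEmpty = ($s -match $allowEmpty)
        NonEmpty   = ($s -match $nonEmpty)
    }
}

# Only the last input passes both patterns; the first two pass AllowEmpty only.
$comparison | Format-Table -AutoSize
```

Both variants still accept a completely blank input, because the ^$ alternative is unchanged.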

To answer the question implied by the post's title:

  • It is the . regex metacharacter that represents any character - albeit by default excluding \n, i.e. a Unix-format newline, technically a LF character (LINE FEED, U+000A).

    • To make . also match the latter, enable the single-line regex option: place (?s) at the start of the regex (in the simplest case, to apply to the entire expression).
  • Using a negated character class, such as [^'] to match anything but ', implicitly matches a newline too, even without setting the single-line option.

    • Independently of whether . matches newlines or not, a run of characters - .* (if possibly empty) or .+ (if non-empty) - can be qualified with ? to make matching non-greedy, so that, in the (single-line) case at hand, .+? is an alternative to [^']+, i.e. matches only up to (but excluding) the closest ' char., if any.
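
The points above can be tried out with a short sketch. The sample strings and variable names are assumptions made up for illustration.

```powershell
# A quoted token that contains a LF character (`n inside an expandable string).
$text = "'line1`nline2'"

$dotDefault    = $text -match "^'.+'$"       # False: . does not match the newline
$dotSingleLine = $text -match "(?s)^'.+'$"   # True:  (?s) lets . match the newline
$negatedClass  = $text -match "^'[^']+'$"    # True:  [^'] matches the newline anyway

# Greedy vs. non-greedy on a single-line list.
$list   = "'one','two'"
$greedy = [regex]::Match($list, "'(.+)'").Groups[1].Value      # one','two
$lazy   = [regex]::Match($list, "'(.+?)'").Groups[1].Value     # one
$class  = [regex]::Match($list, "'([^']+)'").Groups[1].Value   # one

$dotDefault, $dotSingleLine, $negatedClass, $greedy, $lazy, $class
```

The greedy form runs to the last ' in the input, which is why either the lazy quantifier or the negated character class is needed when extracting individual entries.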

As for a specific solution, to provide an alternative to Ömer Sezer's helpful answer:

  • The following regex uses a single capture group ((...)) to capture all tokens in the input string: ^(?:'([^']+)',?)*$

  • In combination with using [regex]::Match() in lieu of -match, this allows you to extract the individual tokens represented in the input string, as shown next:

$string = Read-Host 'Enter a string'

$match = [regex]::Match($string, "^(?:'([^']+)',?)*$")
if ($match.Success) {

  Write-Verbose -Verbose ('The string matches, ' + ('but is empty.', 'with the following tokens:')[[bool] $match.Value])
  $match.Groups[1].Captures.Value

} else {

  Write-Warning "The string does NOT match the pattern."

}

Note:

  • Accessing .Groups[1] on the [Match] instance returned by [regex]::Match() returns what the first (and only) capture group captured:

    • .Captures accesses the collection of the multiple instances of the capture-group matches, themselves expressed as [Capture] instances.

    • .Value uses member-access enumeration to extract the matched text from each instance.

  • While the use of "..." is convenient here - because it allows use of ' without escaping - it is generally preferable to use '...', i.e. verbatim string literals, to represent regexes, so that PowerShell does not expand $ or ` sequences before the regex engine sees the pattern - see this answer.
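
As a sketch of the notes above, the same regex can be written as a verbatim string by doubling each embedded single quote, and the captures can be inspected directly. The variable names and the 'foo','bar' input are illustrative assumptions.

```powershell
$dq = "^(?:'([^']+)',?)*$"       # expandable string, as used in the answer
$sq = '^(?:''([^'']+)'',?)*$'    # verbatim string: each embedded ' is written as ''

$samePattern = $dq -ceq $sq      # True: both literals produce the identical regex text

$match = [regex]::Match("'foo','bar'", $sq)

# One capture per repetition of the single capture group.
$tokens      = $match.Groups[1].Captures.Value
$captureType = $match.Groups[1].Captures[0].GetType().Name

$samePattern   # True
$tokens        # foo, bar
$captureType   # Capture
```

The doubled quotes cost some readability here, which is the trade-off behind choosing "..." for this particular pattern.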

Sample output, with 'foo','bar','baz' provided as verbatim input to the Read-Host command:

VERBOSE: The string matches, with the following tokens:
foo
bar
baz
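
One thing to be aware of: because the comma is optional (,?) in the regex above, inputs such as 'a''b' or 'a', are also accepted. If that is not wanted, a stricter variant is possible. The sketch below is an assumption-laden alternative, not part of the original answer: it relies on .NET allowing the same group name (here the hypothetical name t) to appear more than once, with all captures collected under that one name.

```powershell
$lenient = "^(?:'([^']+)',?)*$"                          # regex from the answer
$strict  = "^(?:'(?<t>[^']+)'(?:,'(?<t>[^']+)')*)?$"     # hypothetical stricter variant

# Missing comma and trailing comma: accepted by the lenient regex, rejected by the strict one.
$lenientResults = "'a''b'", "'a'," | ForEach-Object { $_ -match $lenient }   # True, True
$strictResults  = "'a''b'", "'a'," | ForEach-Object { $_ -match $strict }    # False, False

# Blank input is still valid, and tokens are still available from a single group.
$blankOk = '' -match $strict
$tokens  = [regex]::Match("'foo','bar','baz'", $strict).Groups['t'].Captures.Value

$lenientResults; $strictResults; $blankOk; $tokens
```

Whether the extra strictness matters depends on how the validated string is consumed afterwards.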

2 Comments

Great answer as well, thank you for taking time to explain!
Glad to hear you think so, @HTWingNut; my pleasure.
