I am trying to extract a URL from the body of a mail in PowerShell.

I am using the following regex (found on this site):

$regexURL = "@^(https?|ftp)://[^\s/$.?#].[^\s]*$@iS"

Then I loop over a mail folder and, for each mail item, run:

foreach ($Mail in $subfolder.items)  {    
    $a = [Regex]::Match($Mail.Subject, $regexURL).Groups[1].Value
    $b = [Regex]::Match($Mail.Body, $regexURL).Groups[1].Value
}

But even when $Mail.Subject or $Mail.Body contains a valid URL, $a and $b stay empty.

I am afraid I do not understand how Match() works.

Thanks for any help with this question. Jerome

  • Indentation! Please fix it before posting! Commented Apr 13, 2016 at 16:35
  • Please post an example of $Mail.Subject and what you expect $a to be. Commented Apr 13, 2016 at 16:42
  • What's the @ good for? This seems to be a regex for another language. Try $regexURL = '(https?|ftp)://[^\s/$.?#].[^\s]*$'. The iS modifiers are not required in PS/.NET AFAIK. Commented Apr 13, 2016 at 16:48
  • $Mail.Subject could be "stockoverflow.com" for instance. $Mail.Body the same with empty lines at the end. '(https?|ftp)://[^\s/$.?#].[^\s]*$' is not working at all. For the time being only "(http[s]?|[s]?ftp[s]?)(:\/\/)([^\s,]+)" is a little bit (!) working and gives "http" back. Commented Apr 13, 2016 at 17:37

1 Answer

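The regex is written for another language (probably PHP), so you need to convert it to .NET syntax. The leading and trailing @ are PHP pattern delimiters and iS are PHP modifiers; .NET uses neither. PowerShell's -match operator is case-insensitive by default (note that [regex]::Match() itself is case-sensitive unless you add (?i) or RegexOptions.IgnoreCase), and S is, I think, an optimizing modifier for PHP, so we'll skip both of those. The leading ^ is dropped as well, so the URL does not have to start the string. Try: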

$regexURL = '(https?|ftp)://[^\s/$.?#].[^\s]*$'

Sample:

$regexURL = '(https?|ftp)://[^\s/$.?#].[^\s]*$'

#URL
[regex]::Match("test http://stackoverflow.com/questions/36604481/issue-using-regexmatch-in-powershell/36605171?noredirect=1#comment60807898_36605171", $regexURL).Groups[0].Value
http://stackoverflow.com/questions/36604481/issue-using-regexmatch-in-powershell/36605171?noredirect=1#comment60807898_36605171

#Protocol
[regex]::Match("test http://stackoverflow.com/questions/36604481/issue-using-regexmatch-in-powershell/36605171?noredirect=1#comment60807898_36605171", $regexURL).Groups[1].Value
http
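
One thing the sample above does not cover: the pattern ends with $, so the URL has to run to the end of the input. A mail body with more text or blank lines after the URL (as described in the question's comments) will therefore not match. A minimal sketch, assuming you want the first URL anywhere in the text; the $body value is made up for illustration and is not from the original mails:

```powershell
# Pattern from the answer, minus the trailing '$' (an assumption about what is wanted:
# the URL may sit anywhere in the text, not only at the very end).
$regexURL = '(https?|ftp)://[^\s/$.?#].[^\s]*'

# Hypothetical mail body: a URL, more text, then trailing blank lines.
$body = "Please see http://stackoverflow.com/questions/36604481 for details.`r`n`r`n"

# With the trailing '$' the match fails, because the URL is not at the end of the input.
$anchored = [regex]::Match($body, $regexURL + '$').Success   # False

$m = [regex]::Match($body, $regexURL)
if ($m.Success) {
    $url    = $m.Groups[0].Value   # whole match: http://stackoverflow.com/questions/36604481
    $scheme = $m.Groups[1].Value   # first capture group: http
}
```

If a body can contain several URLs, [regex]::Matches() returns all of them instead of only the first.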

3 Comments

Yup, .NET doesn't use modifiers, so @^ is basically the equivalent of (?!) here.
I think you found the point, but it is still not working. I have now tried "(http[s]?|[s]?ftp[s]?)(:\/\/)([^\s,]+)" and it gives back "http" in $b. I still have to find the right regexURL for PowerShell 5.
It does exactly what it should. You have http (the protocol) in a capture group (). The match itself is in group 0. See the sample in the updated answer.
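
Regarding the modifiers discussed in the comments: in .NET, the counterpart of a trailing i is either the inline option (?i) or RegexOptions.IgnoreCase. [regex]::Match() is case-sensitive by default, while the -match operator is not. A small sketch with a made-up input string:

```powershell
# Same pattern as in the answer, without the trailing '$'.
$pattern = '(https?|ftp)://[^\s/$.?#].[^\s]*'
$text    = 'See HTTP://EXAMPLE.COM/path'   # made-up input with an upper-case scheme

# [regex]::Match is case-sensitive by default, so the upper-case scheme is not found.
$plain = [regex]::Match($text, $pattern).Success

# Option 1: inline option at the start of the pattern.
$inline = [regex]::Match($text, '(?i)' + $pattern).Value

# Option 2: pass RegexOptions.IgnoreCase as the third argument.
$ignoreCase = [System.Text.RegularExpressions.RegexOptions]::IgnoreCase
$option     = [regex]::Match($text, $pattern, $ignoreCase).Value

# The -match operator ignores case without any extra option.
$operator = $text -match $pattern
```

Here $plain is False, $inline and $option both hold HTTP://EXAMPLE.COM/path, and $operator is True.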
