
I have the following function:

public static string ReturnEmailAddresses(string input)
    {

        string regex1 = @"\[url=";
        string regex2 = @"mailto:([^\?]*)";
        string regex3 = @".*?";
        string regex4 = @"\[\/url\]";

        Regex r = new Regex(regex1 + regex2 + regex3 + regex4, RegexOptions.IgnoreCase | RegexOptions.Multiline);
        MatchCollection m = r.Matches(input);
        if (m.Count > 0)
        {
            StringBuilder sb = new StringBuilder();
            int i = 0;
            foreach (var match in m)
            {
                if (i > 0)
                    sb.Append(Environment.NewLine);
                string shtml = match.ToString();
                var innerString = shtml.Substring(shtml.IndexOf("]") + 1, shtml.IndexOf("[/url]") - shtml.IndexOf("]") - 1);
                sb.Append(innerString); //just titles                    
                i++;
            }

            return sb.ToString();
        }

        return string.Empty;
    }

As you can see I define a url in the "markdown" format:

[url = http://sample.com]sample.com[/url]

In the same way, emails are written in that format too:

[url=mailto:[email protected]][email protected][/url]

However, when I pass in a multiline string containing multiple email addresses, it returns only the first one. I would like it to return every match, but I cannot get that working.

For example

[url=mailto:[email protected]][email protected][/url] /r/n a whole bunch of text here /r/n more stuff here [url=mailto:[email protected]][email protected][/url]

This returns only the first email address above.

  • The "Multiline" Regex option is for when you want to use ^ and $ to match the beginning and end of a line rather than the beginning and end of the whole string. If you aren't using those tokens, that option is meaningless. Commented Jan 30, 2017 at 0:24

2 Answers

2

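The mailto:([^\?]*) part of your pattern matches every character that is not a ?, including ] and line breaks. When the input contains no ?, it greedily consumes the rest of the string, and the regex then backtracks only as far as the last [/url], so you get a single match spanning from the first tag to the last one. Add the closing bracket ] to the excluded characters so that this part stops at the end of the "mailto" section instead of overflowing into the text within the "url" tags: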

\[url=mailto:([^\?\]]*).*?\[\/url\]

See this link for an example: https://regex101.com/r/zcgeW8/1
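As a rough sketch of how the corrected pattern could be used from C#, the version below reads capture group 1 directly instead of slicing the matched text with IndexOf. The class name EmailExtractor and the example.com addresses used to try it are made up for illustration. Note one assumption: it returns the address from the mailto: part rather than the link text, which is the same thing in the question's sample.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class; only the pattern itself comes from the answer above.
public static class EmailExtractor
{
    // "]" is excluded from the address part, so each match stays inside one tag.
    private static readonly Regex MailtoTag = new Regex(
        @"\[url=mailto:([^\?\]]*).*?\[\/url\]",
        RegexOptions.IgnoreCase);

    public static string ReturnEmailAddresses(string input)
    {
        // Group 1 holds the address that follows "mailto:" in each tag.
        var addresses = MailtoTag.Matches(input)
            .Cast<Match>()
            .Select(m => m.Groups[1].Value);

        // Joining an empty sequence yields an empty string, as in the original function.
        return string.Join(Environment.NewLine, addresses);
    }
}
```

Calling EmailExtractor.ReturnEmailAddresses on a string with two such tags separated by line breaks should give both addresses, one per line.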


1 Comment

Thanks @Abion47. That worked well for me. I can see now how it was matching everything.
0

You can extract the desired result with the help of a positive lookahead and a positive lookbehind. See http://www.rexegg.com/regex-lookarounds.html

Try this regex: (?<=\[url=mailto:).*?(?=\])

The regex above captures both email addresses from the sample string:

[url=mailto:[email protected]][email protected][/url] /r/n a whole bunch of text here /r/n more stuff here [url=mailto:[email protected]][email protected][/url]

Result:

[email protected]
[email protected]
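A minimal C# sketch of the lookaround approach, assuming a hypothetical LookaroundDemo class and made-up example.com addresses for testing. Because the lookbehind and lookahead consume no characters, each match value is the address itself, so no capture group or substring work is needed. One caveat: if a tag carried a query string such as ?subject=..., this pattern would include it in the result.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical demo class; only the pattern comes from the answer above.
public static class LookaroundDemo
{
    public static List<string> ExtractAddresses(string input)
    {
        // Match lazily from just after "[url=mailto:" up to, but not including, the next "]".
        const string pattern = @"(?<=\[url=mailto:).*?(?=\])";

        var result = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
        {
            result.Add(m.Value);
        }
        return result;
    }
}
```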
