So far I have

messageText1 = Regex.Replace(messageText1, "(www|http|https)*?(com|.co.uk|.org)", "[URL OMITTED]");

With only www, and without the brackets or http or https, it works as intended.

For example, an input of "Hey check out this site, www.google.com, it's really cool" would output "Hey check out this site, [URL OMITTED], it's really cool".

But if I put the or operators back in for the start of the URL, it only replaces the .com part of the input.

Why won't it work?

Thanks

4 Answers

3
(www|http|https)*?(com|.co.uk|.org)

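means www, http or https, repeated zero or more times, immediately followed by com, .co.uk or .org. So it would match, for example, httphttphttp.co.uk. In www.google.com the www is followed by .goo rather than by one of the endings, so the prefix group matches zero times and only the trailing com is replaced.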

Your intention was probably to have a . before the *. That means it looks for (www|http|https) once, then matches . (any character) zero or more times.

You are also missing the . in .com. However, if you want to match a literal . you need to use \., since a . on its own means 'any character'.

With that in mind, the regex I think you were going for is:

(www|http|https).*?(\.com|\.co\.uk|\.org)
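To see the difference in practice, here is a minimal sketch that runs the question's pattern and the corrected one over the sample sentence. The class name UrlPatternDemo and the helper Scrub are made up for illustration; only Regex.Replace and the two patterns come from the discussion above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of the original question or answer.
public static class UrlPatternDemo
{
    // Pattern from the question: the prefix group is repeated lazily and must
    // be immediately followed by one of the endings.
    public const string Original = "(www|http|https)*?(com|.co.uk|.org)";

    // Corrected pattern: one prefix, then any characters (lazily), then an
    // ending whose dots are escaped so they match a literal '.'.
    public const string Corrected = @"(www|http|https).*?(\.com|\.co\.uk|\.org)";

    public static string Scrub(string input, string pattern)
    {
        return Regex.Replace(input, pattern, "[URL OMITTED]");
    }

    public static void Main()
    {
        string input = "Hey check out this site, www.google.com, it's really cool";

        Console.WriteLine(Scrub(input, Original));
        // Hey check out this site, www.google.[URL OMITTED], it's really cool

        Console.WriteLine(Scrub(input, Corrected));
        // Hey check out this site, [URL OMITTED], it's really cool
    }
}
```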

2 Comments

Great, it works. Will accept the answer in 6 minutes thanks!
Fantastic, no problem
1

This should work better. It will also work for URLs with TLDs other than .com, .co.uk or .org:

messageText1 = Regex.Replace(messageText1, @"\b(?:http://|https://|www\.)\S+", "[URL OMITTED]");
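As a rough illustration of how this pattern behaves, the sketch below applies it to a sentence with other TLDs and to the sentence from the question. The class name AnyTldDemo, the helper Scrub and the example.net / example.io addresses are placeholders chosen for this example. Because \S+ consumes everything up to the next whitespace, punctuation directly after a URL (such as the comma in the question's sentence) is replaced along with it.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; only the pattern itself comes from the answer.
public static class AnyTldDemo
{
    // A prefix (http://, https:// or www.) followed by all non-whitespace
    // characters up to the next space.
    public const string Pattern = @"\b(?:http://|https://|www\.)\S+";

    public static string Scrub(string input)
    {
        return Regex.Replace(input, Pattern, "[URL OMITTED]");
    }

    public static void Main()
    {
        Console.WriteLine(Scrub("see http://example.net/page and www.example.io now"));
        // see [URL OMITTED] and [URL OMITTED] now

        // The comma after the URL is not whitespace, so \S+ swallows it too.
        Console.WriteLine(Scrub("Hey check out this site, www.google.com, it's really cool"));
        // Hey check out this site, [URL OMITTED] it's really cool
    }
}
```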

2 Comments

\b is not required. \S+ will always stop on a \b point.
Thanks, updated this so that it also works with https.
1

Your expression is missing a . somewhere or (possibly better) a \S*:

 (www|http|https)\S*(\.com|\.co\.uk|\.org)

In C#:

 Regex.Replace(messageText1, @"(www|http|https)\S*(\.com|\.co\.uk|\.org)", "[URL OMITTED]");

Note: the .'s are escaped as well, so that they only match a literal dot.

2 Comments

With that it's telling me \S is an unrecognized escape sequence
@ÁppleAssassin11 Probably need to escape the backslash since it's within the string: "(www|http|https)\\S*(com|.co.uk|.org)"
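The "unrecognized escape sequence" error comes from the C# compiler, not from the regex engine: in a regular string literal, \S is not a valid escape. A small sketch of the two ways around it follows, assuming a variant of the pattern with every dot escaped. The class name EscapingDemo and the helper Scrub are invented for this example.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class showing two equivalent ways to write the pattern.
public static class EscapingDemo
{
    // Regular string literal: every backslash meant for the regex is doubled.
    public const string Escaped = "(www|http|https)\\S*(\\.com|\\.co\\.uk|\\.org)";

    // Verbatim string literal (@"..."): backslashes are passed through as-is.
    public const string Verbatim = @"(www|http|https)\S*(\.com|\.co\.uk|\.org)";

    public static string Scrub(string input)
    {
        return Regex.Replace(input, Verbatim, "[URL OMITTED]");
    }

    public static void Main()
    {
        // Both literals produce exactly the same pattern text.
        Console.WriteLine(Escaped == Verbatim);
        // True

        Console.WriteLine(Scrub("Hey check out this site, www.google.com, it's really cool"));
        // Hey check out this site, [URL OMITTED], it's really cool
    }
}
```

The verbatim form is usually easier to read for regular expressions, since the pattern appears exactly as the regex engine will see it.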
0

A simple version which I tried is as follows.

messageText1 = Regex.Replace(messageText1, @"(www)?(.)?[a-z]*.(com)", "[URL OMITTED]");

I tried this with:

string messageText1 = " Hey check this out, http://www.google.com,its cool";

string messageText1 = " Hey check this out, www.google.com,its cool";

string messageText1 = " Hey check this out, google.com,its cool";
