
The Regex:

https?://([a-zA-Z0-9-_]{1,50}[.])*[a-zA-z0-9-_]{1,50}[.]([(org)(gov)(com)]{3}|[(us)(fi)]{2})

The Tester:

http://regex.powertoy.org/

The Code:

if(preg_match_all('|https?://([a-zA-Z0-9-_]{1,50}[.])*[a-zA-z0-9-_]{1,50}[.]([(org)(gov)(com)]{3}|[(us)(fi)]{2})|',$row['text'],$links))
    {
        print_r($links[0]);
        /*for($x=0;$x<count($links[0]);$x++)
        {
            $row['text'] = str_replace($links[0][$x], 'link' . $links[0][$x] . 'link', $row['text']);
        }*/
    }else{
        echo 'Failure!';
    }

The regex matches URLs fine in the tester, but matches nothing at all when run from my HTML/PHP front end. I'm not sure what the problem is. The point of the regex/code is to match URLs regardless of how many subdomains they have.

    What does the code look like and what input is not matching when it should? Commented Aug 16, 2012 at 14:43

2 Answers


A fixed version of your regex pattern is:

https?:\/\/(?:[\w-]{1,50}\.)*[\w-]{1,50}\.(?:org|gov|com|us|fi)

But I recommend using:

https?:\/\/(?:[a-zA-Z\d]+(?:\-[a-zA-Z\d]+)*\.)+(?:org|gov|com|us|fi) 
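As a minimal sketch, the recommended pattern could be applied with preg_match_all roughly like this. The sample text and the variable names `$text` and `$pattern` are assumptions for illustration (the question's `$row['text']` would take the place of `$text`), and `#` is used as the delimiter so that the `|` in the alternation is not mistaken for the end of the pattern:

```php
<?php
// Hypothetical sample input; not taken from the question.
$text = 'See http://example.com and https://a.b.example.org/page, but not http://-bad-.com.';

// The recommended pattern wrapped in # delimiters.
// Each label must start and end with a letter or digit; dashes are only allowed inside.
$pattern = '#https?:\/\/(?:[a-zA-Z\d]+(?:\-[a-zA-Z\d]+)*\.)+(?:org|gov|com|us|fi)#';

if (preg_match_all($pattern, $text, $links)) {
    // $links[0] holds every full match found in $text.
    print_r($links[0]);
} else {
    echo 'No URLs found';
}
```

With this input, the third URL is skipped because its first label starts with a dash, and the path `/page` is not part of the match since the pattern stops at the top-level domain.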


You are using the | character as your delimiter, but you are also using it inside your regex, so PHP treats the first | within the pattern as the closing delimiter.

I would recommend using a different delimiter character and making the regex case-insensitive, which also avoids problems such as the a-zA-z typo in your pattern:

preg_match_all('#https?://([a-zA-Z0-9-_]{1,50}[.])*[a-zA-z0-9-_]{1,50}[.]([(org)(gov)(com)]{3}|[(us)(fi)]{2})#i',$row['text'],$links)
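The delimiter clash can be reproduced with a small sketch. The sample text, the variable names, and the simplified pattern body (a plain `org|gov|com|us|fi` alternation instead of the question's character classes) are assumptions made to keep the example short:

```php
<?php
// Hypothetical sample input standing in for $row['text'].
$text = 'Visit http://www.example.com or https://sub.domain.example.org today.';

// Simplified pattern body; it contains | characters in the TLD alternation.
$body = 'https?://(?:[a-z0-9-]{1,50}[.])*[a-z0-9-]{1,50}[.](?:org|gov|com|us|fi)';

// With | as the delimiter, the first | inside the body ends the pattern early.
// The remainder is parsed as modifiers, PHP raises a warning (suppressed here
// with @), and the call returns false.
$broken = @preg_match_all('|' . $body . '|', $text, $m1);

// With # as the delimiter, | is ordinary alternation again.
$fixed = preg_match_all('#' . $body . '#i', $text, $m2);

var_dump($broken);
print_r($m2[0]);
```

A tester that takes the bare pattern without PHP delimiters never hits this, which would explain why the same regex appears to work there.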

4 Comments

Underscores should not be allowed in a domain name, and a dash cannot be the first or last character of a (sub-)domain.
@Ωmega To be honest, I didn't really look at the regex itself, but mainly why it would work in the tester but not in php.
Upvoted because you actually explained the problem with the pattern, though you don't need [a-zA-Z] if you use the i pattern modifier.
@dualed True, that was kind of implied with the case-insensitivity. I would probably use \w and put the - at the beginning or the end to avoid it turning into a range as well.
