I'd like to make sure URLs such as javascript:alert('a'); and vbscript variants are not allowed, by whitelisting https?|ftp. That's easy enough: ^(?:https?|ftp):// But how can I allow relative URLs as well, such as ../../../blah, ./blah, and /images/img.png?

In other words, is using ^(?:(?:https?|ftp)://|[./]) safe?
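
One way to probe that question is to run the pattern against a handful of inputs. The sketch below is only an illustration: the helper name passes_prefix_check and the sample URLs are assumptions, not part of the original question.

```php
<?php
// Hypothetical helper: applies the prefix pattern from the question as-is.
function passes_prefix_check($url)
{
    return preg_match('~^(?:(?:https?|ftp)://|[./])~', $url) === 1;
}

var_dump(passes_prefix_check('http://example.com/'));      // bool(true)
var_dump(passes_prefix_check('../../../blah'));            // bool(true)
var_dump(passes_prefix_check('/images/img.png'));          // bool(true)
var_dump(passes_prefix_check("javascript:alert('a');"));   // bool(false)

// Inputs that may be surprising:
var_dump(passes_prefix_check('//example.org/x'));          // bool(true): protocol-relative URL to another host
var_dump(passes_prefix_check('images/img.png'));           // bool(false): relative URL without a leading . or /
var_dump(passes_prefix_check('http://safeurl.com" onclick="alert(1)')); // bool(true): only the prefix is checked
```

A prefix match therefore says nothing about the rest of the string, so the value still has to be escaped for the HTML attribute it ends up in.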

I've asked around, and a possible solution might be to use parse_url and check the scheme:

if !scheme or scheme == http or scheme == https or scheme == ftp or scheme == mailto

  • What exactly are you trying to do, what are you trying to protect against? Commented May 2, 2011 at 10:23
  • XSS. I'm trying to filter URLs that go into <a href="" or <img src="" Commented May 2, 2011 at 10:26

2 Answers

Instead of using regular expressions you could use parse_url and check that scheme is either empty or one of http, https, and ftp:

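$components = parse_url($url);
// parse_url() returns false for seriously malformed URLs; treat those as invalid.
if ($components !== false
    && (!isset($components['scheme'])
        || in_array(strtolower($components['scheme']), array('http', 'https', 'ftp')))) {
    // valid: relative URL (no scheme) or an allowed scheme
} else {
    // invalid
}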
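
The same check can be wrapped in a reusable function. This is a minimal sketch under stated assumptions: the name is_allowed_url is hypothetical, and the up-front rejection of whitespace and control characters is an extra precaution added here (browsers are lenient about such characters in URLs), not something the answer above prescribes.

```php
<?php
// Hypothetical helper: true for relative URLs and for http, https, and ftp URLs.
function is_allowed_url(string $url): bool
{
    // Assumption: refuse empty input and anything containing whitespace or
    // control characters, since browsers may strip these before resolving a URL.
    if ($url === '' || preg_match('/[\x00-\x20\x7F]/', $url)) {
        return false;
    }

    $components = parse_url($url);
    if ($components === false) {
        return false; // seriously malformed URL
    }

    // No scheme means a relative URL; otherwise the scheme must be whitelisted.
    if (!isset($components['scheme'])) {
        return true;
    }
    return in_array(strtolower($components['scheme']), array('http', 'https', 'ftp'), true);
}

var_dump(is_allowed_url('/images/img.png'));         // bool(true)
var_dump(is_allowed_url("javascript:alert('a');"));  // bool(false)
```

Passing this check does not make the string safe to print into HTML; it still needs escaping for the attribute context.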

2 Comments

Yep I added that to my question 5 mins ago. :)
@John: I hadn’t noticed that. :)

Also see: Sanitizing strings to make them URL and filename safe?

I'm trying to filter URLs that go into <a href="" or <img src=""

Be careful, because it's possible to "break out" of the attribute with just a "starts with" regular expression. For instance, I could provide http://safeurl.com" onclick="alert('xss attack'), and when inserted into your attribute you would have:

<a href="http://safeurl.com" onclick="alert('xss attack')">

Make sure to urlencode() the value as well as any other security you're doing.
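
To make the breakout concrete, here is a small sketch that builds the link with and without escaping. It uses htmlspecialchars() for the attribute context, which is a swapped-in choice for illustration rather than the urlencode() call mentioned above; the function names render_link_unsafe and render_link are hypothetical.

```php
<?php
// Hypothetical helper: concatenates the raw value, so a quote ends the attribute.
function render_link_unsafe($url)
{
    return '<a href="' . $url . '">link</a>';
}

// Hypothetical helper: escapes quotes and other HTML metacharacters first.
function render_link($url)
{
    return '<a href="' . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . '">link</a>';
}

$url = 'http://safeurl.com" onclick="alert(\'xss attack\')';

echo render_link_unsafe($url), "\n";
// <a href="http://safeurl.com" onclick="alert('xss attack')">link</a>

echo render_link($url), "\n";
// <a href="http://safeurl.com&quot; onclick=&quot;alert(&#039;xss attack&#039;)">link</a>
```

In the escaped version the injected text stays inside the href value, so no onclick attribute is created.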

I would probably advise against allowing ../../relative/urls, or else use parse_url as Gumbo has suggested.

Check out the info on OWASP.org for some more advice.

2 Comments

Thanks, I was going to use htmlentities to properly escape the data before putting it in the attributes. I'm not sure urlencode() should be used?
I would use urlencode() on each slashed segment to be safe. This is definitely worth writing a function for, something really solid and paranoid.
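
A rough sketch of that per-segment idea, for a path only (no scheme, host, or query string). The function name encode_path_segments is hypothetical, and it uses rawurlencode() rather than urlencode() so that a space becomes %20 instead of +; treat that as an assumption about what is wanted.

```php
<?php
// Hypothetical helper: percent-encodes each slash-separated segment of a path
// while keeping the slashes themselves intact.
function encode_path_segments($path)
{
    return implode('/', array_map('rawurlencode', explode('/', $path)));
}

echo encode_path_segments('/images/my file.png'), "\n"; // /images/my%20file.png
echo encode_path_segments('../../blah'), "\n";          // ../../blah
echo encode_path_segments('a"b/c'), "\n";               // a%22b/c
```

Note that a path that is already percent-encoded would be encoded a second time, so this only suits raw, unencoded input.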
