Given a URL path, I am trying to write a regular expression that matches when the path points to a directory and does not match when it ends in a file extension such as .php, .js, .html, .htm, etc.

Match:

/www/site/path/ny/

Match:

/www/news/state.208/ii

Do not match:

/www/news/index.php

Do not match:

/www/site/fr/post.py

Here is my regex that I've been working with:

^([a-zA-Z0-9/_\-]*[^/])$

This regex does roughly what I want, but it isn't precise enough. It treats any "." as the start of a file extension and stops matching there, but a directory name can also contain a ".", as in /www/news/state.208/ii above.

I also tried using negative lookaheads, with no luck:

^(.(?!\.php)|(?!\.htm?l)|(?!\.js))$
  • What do you mean by a period? Commented Dec 7, 2012 at 12:36
  • A period is a 'full stop', i.e. '.'. Commented Dec 7, 2012 at 12:36
  • You haven't specified what you're using this for, but PHP has lots of built-in functions in this area: file_exists(), is_dir(), pathinfo(), basename(), and others may be more appropriate than a complex regex here. Commented Dec 7, 2012 at 12:46

3 Answers

2

Try this tiny one:

^.*/[^\.]*$

Edit:

^.*/((?!\.js|\.php|\.html).)*$

Replace or extend the list with the extensions you want to reject.
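As a rough check of the edited pattern, the sketch below builds it from a list of extensions and runs it against the paths from the question and the comment. The helper name directory_pattern() and the extra htm and py entries are assumptions for illustration, not part of the answer; '#' is used as the delimiter so the slashes need no escaping.

```php
<?php
// Hypothetical helper: rebuilds the answer's pattern from a list of extensions.
function directory_pattern(array $extensions): string
{
    $alternatives = [];
    foreach ($extensions as $ext) {
        // preg_quote() escapes regex metacharacters and the '#' delimiter.
        $alternatives[] = '\.' . preg_quote($ext, '#');
    }
    // Same shape as the answer: after some '/', no position may start a listed extension.
    return '#^.*/((?!' . implode('|', $alternatives) . ').)*$#';
}

// Assumed extension list; 'htm' and 'py' are added to cover the question's examples.
$pattern = directory_pattern(['js', 'php', 'html', 'htm', 'py']);

$paths = [
    '/www/site/path/ny/',
    '/www/news/state.208/ii',
    '/www/ww.news4',
    '/www/news/index.php',
    '/www/site/fr/post.py',
];

foreach ($paths as $path) {
    $result = preg_match($pattern, $path) === 1 ? 'match' : 'no match';
    printf("%-26s %s\n", $path, $result);
}
```

With only js, php and html in the list, /www/site/fr/post.py would still match, which is why the list has to name every extension you want rejected. The lookahead is also not anchored to the end of the path, so a last segment such as notes.phpbackup is rejected too.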


2 Comments

This almost works, except it wouldn't match the directory: "/www/ww.news4".
@deraad: See the edit, then. Since you want "file extensions" to be ignored, the only way to achieve that with a regex is to define a list of unwanted extensions.
2

Since you're using PHP, you might want to try pathinfo instead:

$parts = pathinfo('/www/site/path/file.php');

In this case $parts['extension'] will be set. When the last path segment contains no ".", the key is absent.
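A minimal sketch of that check, assuming a hypothetical wrapper named has_extension() (pathinfo() itself is the PHP built-in), applied to some of the paths from the question:

```php
<?php
// Hypothetical wrapper around the built-in pathinfo().
function has_extension(string $path): bool
{
    $parts = pathinfo($path);
    // pathinfo() only adds the 'extension' key when the last segment contains a '.'.
    return isset($parts['extension']);
}

$paths = [
    '/www/site/path/ny/',
    '/www/news/state.208/ii',
    '/www/news/index.php',
];

foreach ($paths as $path) {
    printf("%-26s %s\n", $path, has_extension($path) ? 'has extension' : 'no extension');
}
```

Note that pathinfo() only inspects the string and never touches the filesystem, so a directory whose last segment contains a dot, such as /www/ww.news4, is reported as having the extension news4.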

3 Comments

Good idea, but I need this to be a regex for a variety of reasons.
@deraad - it might help you with getting helpful answers if you explain those 'variety of reasons'. From what you've said so far, regex doesn't sound like the appropriate solution.
@SDC I need it in regex form because it will go not only in a PHP function, but in an htaccess file as well. If I were only using this in my PHP script, I would use the PHP functions like pathinfo()
1

To simplify your question: you are looking for a string that starts with /, and whose last segment doesn't have an extension:

[\/]{0,1}.*\/[^\.]*.{1}$

Edit:

This works for all your examples:

^(.*\/){0,1}[^\.]*.{1}$
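One way to check that claim is to run the pattern over the four paths from the question. This is only a sketch: the helper name matches_path() is an assumption, and the expected values are taken from the question's "Match" and "Do not match" labels.

```php
<?php
// Pattern from the edit above, wrapped in '#' delimiters for preg_match().
$pattern = '#^(.*\/){0,1}[^\.]*.{1}$#';

// Hypothetical helper: true when the whole path matches the pattern.
function matches_path(string $pattern, string $path): bool
{
    return preg_match($pattern, $path) === 1;
}

// Expected results as labelled in the question.
$expected = [
    '/www/site/path/ny/'     => true,
    '/www/news/state.208/ii' => true,
    '/www/news/index.php'    => false,
    '/www/site/fr/post.py'   => false,
];

foreach ($expected as $path => $shouldMatch) {
    $actual = matches_path($pattern, $path);
    printf("%-26s %s\n", $path, $actual === $shouldMatch ? 'as expected' : 'differs');
}
```

The pattern does not list extensions; it rejects any "." after the last "/", so it will also reject a directory such as /www/ww.news4.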

2 Comments

It doesn't necessarily HAVE to start with a '/'. It may or may not. To simplify it should look for strings containing: a-zA-Z0-9/_\.- and does not end in a file extension.
Thanks for editing that, but it still appears to match "www/fr/this.php". I'm trying to get it not to match when there is a file extension at the end.
