5

How do I create a regex to detect whether a URL string contains a file extension (.pdf, .jpeg, .asp, .cfm, ...)?

Valid (without extensions):

Invalid (with extensions):

Thanks, Celso

1
  • What do you mean by contains a file extension? Do you mean ends with one of these strings? Can you give a few examples of what you don't want to match? Commented Mar 3, 2011 at 22:12

5 Answers

3

In Java, you are better off using String.endsWith(). It is faster and easier to read. Example:

"file.jpg".endsWith(".jpg") == true

3

An alternative version without a regex, using the URI class:

import java.net.*;

class IsFile { 
  public static void main( String ... args ) throws Exception { 
    URI u = new URI( args[0] );
    for( String ext : new String[] {".png", ".pdf", ".jpg", ".html"  } ) { 
      if( u.getPath().endsWith( ext ) ) { 
        System.out.println("Yeap");
        break;
      }
    }
  }
}

Works with:

java IsFile "http://download.oracle.com/javase/6/docs/api/java/net/URI.html#getPath()"

1 Comment

This is a much better approach particularly in a filter where you already have a requestURI.
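
Building on the URI approach, here is a minimal sketch that looks only at the last path segment, so query strings and fragments are ignored and the comparison is case-insensitive. The class name UrlExtensionCheck, the method hasKnownExtension, and the extension list are assumptions for illustration, not part of the original answer.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class UrlExtensionCheck {
    // Hypothetical list of extensions to detect; adjust as needed.
    private static final List<String> EXTENSIONS =
            Arrays.asList("pdf", "jpeg", "jpg", "asp", "cfm", "html");

    static boolean hasKnownExtension(String url) {
        String path;
        try {
            // getPath() excludes the query string and the fragment.
            path = new URI(url).getPath();
        } catch (URISyntaxException e) {
            return false; // not a parseable URI
        }
        if (path == null) {
            return false; // opaque URIs (e.g. mailto:) have no path
        }
        // Only the last path segment can carry the file extension.
        String lastSegment = path.substring(path.lastIndexOf('/') + 1);
        int dot = lastSegment.lastIndexOf('.');
        if (dot < 0) {
            return false;
        }
        String ext = lastSegment.substring(dot + 1).toLowerCase(Locale.ROOT);
        return EXTENSIONS.contains(ext);
    }

    public static void main(String[] args) {
        System.out.println(hasKnownExtension("http://www.yahoo.com/a.pdf?p=1"));
        System.out.println(hasKnownExtension("http://www.yahoo.com/hello?p=a.htm"));
    }
}
```

Because the host name is never part of the path, a bare URL such as http://www.yahoo.com is not mistaken for a file ending in ".com".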
3

How about this?

// assuming the file extension is either 3 or 4 characters long
public boolean hasFileExtension(String s) {
    return s.matches("^[\\w\\d\\:\\/\\.]+\\.\\w{3,4}(\\?[\\w\\W]*)?$");
}

@Test
public void testHasFileExtension() {
    assertTrue("3-character extension", hasFileExtension("http://www.yahoo.com/a.pdf"));
    assertTrue("3-character extension", hasFileExtension("http://www.yahoo.com/a.htm"));
    assertTrue("4-character extension", hasFileExtension("http://www.yahoo.com/a.html"));
    assertTrue("3-character extension with param", hasFileExtension("http://www.yahoo.com/a.pdf?p=1"));
    assertTrue("4-character extension with param", hasFileExtension("http://www.yahoo.com/a.html?p=1&p=2"));

    assertFalse("2-character extension", hasFileExtension("http://www.yahoo.com/a.co"));
    assertFalse("2-character extension with param", hasFileExtension("http://www.yahoo.com/a.co?p=1&p=2"));
    assertFalse("no extension", hasFileExtension("http://www.yahoo.com/hello"));
    assertFalse("no extension with param", hasFileExtension("http://www.yahoo.com/hello?p=1&p=2"));
    assertFalse("no extension with param ends with .htm", hasFileExtension("http://www.yahoo.com/hello?p=1&p=a.htm"));
}

0

Not a Java developer anymore, but you could define what you're looking for with the following regex:

"/\.(pdf|jpe{0,1}g|asp|docx{0,1}|xlsx{0,1}|cfm)$/i"

Not certain what the function would look like.

1 Comment

You probably want ? rather than {0,1} for simplicity.
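
For reference, a hedged sketch of how that pattern might be written in Java, which has no /.../i literal syntax. It uses ? as the comment suggests and Pattern.CASE_INSENSITIVE in place of the i flag. The class name ExtensionRegex and the method endsWithKnownExtension are made up for this example.

```java
import java.util.regex.Pattern;

class ExtensionRegex {
    // Same alternatives as the regex above, with ? instead of {0,1}.
    private static final Pattern EXT = Pattern.compile(
            "\\.(pdf|jpe?g|asp|docx?|xlsx?|cfm)$", Pattern.CASE_INSENSITIVE);

    static boolean endsWithKnownExtension(String s) {
        // find() rather than matches(): the pattern is anchored only at the end.
        return EXT.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(endsWithKnownExtension("http://www.yahoo.com/a.pdf"));
        System.out.println(endsWithKnownExtension("http://www.yahoo.com/hello"));
    }
}
```

Note that the pattern only checks the very end of the string, so a URL with a query string after the file name would not match.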
0

If the following code returns true, the URL ends with a file extension:

urlString.matches("\\p{Graph}+\\.\\p{Alpha}{2,4}$");

This assumes that a file extension is a dot followed by 2, 3, or 4 alphabetic characters.
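
A small sketch to see how this pattern behaves on a few inputs. The wrapper class GraphPatternDemo and the method endsWithExtension are hypothetical names. Two cases may be surprising: a URL with a query string is rejected, and a bare host name is accepted because ".com" looks like an extension.

```java
class GraphPatternDemo {
    static boolean endsWithExtension(String urlString) {
        // Same expression as above; String.matches() tests the whole string.
        return urlString.matches("\\p{Graph}+\\.\\p{Alpha}{2,4}$");
    }

    public static void main(String[] args) {
        String[] samples = {
            "http://www.yahoo.com/a.pdf",     // dot plus 3 letters at the end
            "http://www.yahoo.com/hello",     // no dot in the last segment
            "http://www.yahoo.com/a.pdf?p=1", // query string follows the extension
            "http://www.yahoo.com"            // host only, ends in ".com"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + endsWithExtension(s));
        }
    }
}
```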
