
Regex newbie here. I have a regular expression that matches Windows pathnames and UNC paths, terminated by '\'.

Working examples:

c:\windows\
c:\
\\server\share\
\\server\sh are\

Invalid:

c:\windows
\\server
\\server\share
\\server\ share \

It works as expected (at least I hope so), but it's pretty unreadable and not very performant, so any tips for optimization are greatly appreciated.

/\A(
  ([a-z]:\\(([a-zA-Z0-9äöüÄÖÜß_.$]+|[a-zA-Z0-9äöüÄÖÜß_.$]+[a-zA-Z0-9äöüÄÖÜß_.$\ ]*[a-zA-Z0-9äöüÄÖÜß_.$]+)\\)*)|
  (\\\\(([a-zA-Z0-9äöüÄÖÜß_.$]+|[a-zA-Z0-9äöüÄÖÜß_.$]+[a-zA-Z0-9äöüÄÖÜß_.$\ ]*[a-zA-Z0-9äöüÄÖÜß_.$]+)\\)+(([a-zA-Z0-9äöüÄÖÜß_.$]+|
  [a-zA-Z0-9äöüÄÖÜß_.$]+[a-zA-Z0-9äöüÄÖÜß_.$\ ]*[a-zA-Z0-9äöüÄÖÜß_.$]+)\\)+)
)\z/
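One way to cut the repetition, as a rough sketch rather than a definitive rewrite: build the repeated character class once and interpolate it. The variable names (word, name, path_re) are made up for illustration, and the sketch assumes the pattern above is meant to be read without the line breaks and indentation. A single name is written as "allowed character, optionally followed by allowed characters or spaces and ending in an allowed character", which should accept the same names as the two-way alternation above.

```ruby
# encoding: utf-8
# Characters the question allows inside a name (copied from its character class).
word = 'a-zA-Z0-9äöüÄÖÜß_.$'

# One path component: starts and ends with an allowed character,
# with spaces permitted only in between.
name = "[#{word}](?:[#{word} ]*[#{word}])?"

# Drive letter plus any number of components, or UNC with at least two.
path_re = /\A(?:[a-z]:\\(?:#{name}\\)*|\\\\(?:#{name}\\){2,})\z/

# Single-quoted Ruby strings: '\\' is one literal backslash.
valid   = ['c:\\windows\\', 'c:\\', '\\\\server\\share\\', '\\\\server\\sh are\\']
invalid = ['c:\\windows', '\\\\server', '\\\\server\\share', '\\\\server\\ share \\']

all_valid_match  = valid.all?    { |p| p =~ path_re }
no_invalid_match = invalid.none? { |p| p =~ path_re }

puts "valid examples accepted:   #{all_valid_match}"
puts "invalid examples rejected: #{no_invalid_match}"
```

The non-capturing groups (?:...) also avoid the bookkeeping for the nested capture groups, which are not needed for a plain validity check.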
  • You're missing out on an enormous number of valid filenames (there are lots of other characters allowed) - isn't that a problem? And you're allowing many invalid filenames (for example con.txt). What exactly are you planning to do? Checking for validity? Commented Sep 21, 2011 at 7:58
  • Sorry for the lack of information. I'm using Ruby 1.9, and you are right, I'm trying to check for validity. Commented Sep 21, 2011 at 9:03

1 Answer


In Ruby 1.9, the following should work:

if subject =~ 
    /\A(?:(?!.*\\(?:con|prn|aux|nul|com\d|lpt\d)\\)  # exclude invalid names
    (?:                                              # Either match        
     [a-z]:\\                                        # drive letter 
    |                                                # or
     \\\\(?:[^\\\/:*?"<>|\s]+\\){2}                  # UNC share name
    )                                                # End of alternation
    (?:                                              # Try to match:
     (?!\s)                                          # (Assert no starting space)
     [^\\\/:*?"<>|\r\n]+                             # a valid directory name
     (?<!\s)                                         # (Assert no ending space)
     \\                                              # backslash
    )*                                               # repeat as needed
    )\Z/mix
    # Successful match
else
    # Match attempt failed
end
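To see how the pattern behaves on the examples from the question, a small check like the following may help. It is only a sketch: the variable names are illustrative, the regex is the one above with shortened comments, and c:\con\ is an extra sample added here to exercise the reserved-name exclusion.

```ruby
# The pattern from the answer, kept in a local variable (name is illustrative).
windows_dir = /\A(?:(?!.*\\(?:con|prn|aux|nul|com\d|lpt\d)\\)  # exclude reserved names
    (?:
     [a-z]:\\                                  # drive letter
    |
     \\\\(?:[^\\\/:*?"<>|\s]+\\){2}            # UNC server and share
    )
    (?:
     (?!\s) [^\\\/:*?"<>|\r\n]+ (?<!\s) \\     # directory name plus backslash
    )*
    )\Z/mix

# Single-quoted Ruby strings: '\\' is one literal backslash.
samples = [
  'c:\\windows\\',
  'c:\\',
  '\\\\server\\share\\',
  '\\\\server\\sh are\\',
  'c:\\windows',
  '\\\\server',
  '\\\\server\\share',
  '\\\\server\\ share \\',
  'c:\\con\\'
]

results = samples.map { |path| [path, !!(path =~ windows_dir)] }
results.each { |path, ok| puts "#{ok ? 'match   ' : 'no match'}  #{path}" }
```

The first three samples match and the rest do not. Note that \\server\sh are\ is rejected even though the question lists it as valid, because this pattern does not allow whitespace in server or share names.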

5 Comments

  • This allows UNC paths without share names (\\server\), but I'll get this to work, and it looks way cleaner than my regzilla. Thanks.
  • Ah, OK, shouldn't they be allowed? No problem.
  • Hmm, this still allows \\server\. Isn't something like this (pseudocode) needed: if it is a UNC path, at least one valid directory name is required?
  • You're right, sorry. Now it shouldn't do this any more. Spaces are not permitted in server and share names, is that correct?
  • I'm not sure about that. AFAIR they are permitted in share names, but spaces in server names are replaced with '-', e.g. '\\ser-ver\my share\'. I'll have to look this up. Thanks again for your answer.
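
If it turns out that share names may contain spaces (the thread leaves this open, so treat it as an assumption), one possible variant keeps the server name free of whitespace and applies the same rule to the share name as to directory names: no leading or trailing whitespace. The variable name unc_spaces is made up for this sketch.

```ruby
# Variant of the answer's pattern; assumes inner spaces are legal in share names.
unc_spaces = /\A(?:(?!.*\\(?:con|prn|aux|nul|com\d|lpt\d)\\)  # exclude reserved names
    (?:
     [a-z]:\\                                    # drive letter
    |
     \\\\ [^\\\/:*?"<>|\s]+ \\                   # server name: no whitespace
     (?!\s) [^\\\/:*?"<>|\r\n]+ (?<!\s) \\       # share name: inner spaces allowed
    )
    (?:
     (?!\s) [^\\\/:*?"<>|\r\n]+ (?<!\s) \\       # directory name plus backslash
    )*
    )\Z/mix

# Single-quoted Ruby strings: '\\' is one literal backslash.
checks = {
  '\\\\server\\sh are\\'   => !!('\\\\server\\sh are\\'   =~ unc_spaces),
  '\\\\server\\share\\'    => !!('\\\\server\\share\\'    =~ unc_spaces),
  '\\\\server\\ share \\'  => !!('\\\\server\\ share \\'  =~ unc_spaces),
  '\\\\server\\'           => !!('\\\\server\\'           =~ unc_spaces),
  '\\\\ser ver\\share\\'   => !!('\\\\ser ver\\share\\'   =~ unc_spaces)
}
checks.each { |path, ok| puts "#{ok ? 'match   ' : 'no match'}  #{path}" }
```

With this variant, \\server\sh are\ from the question matches, while \\server\ share \ and a bare \\server\ are still rejected.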
