I am trying to parse a file name according to a given pattern, but I am not able to get the match right. Here is a sample file name:
CRS-ISAU-RPV#3430_Dedalus_Conc.ok.erto_AOTreviglio.doc
And here are my requirements:
Up to the character #, the file name can contain anything. After #, I have to find the character _ or the character - to separate the strings. The strings between the separators (each separator is either _ or -, but not both) can contain any other character. So after the character # there must be exactly three (3) _ or - characters combined. The name should end with .doc, .docx or .odt, but NOT with .ok.doc, .ok.docx or .ok.odt.
Here is what I tried:
(.*)#([^_-]+)[_-]([^_-]+)[_-]([^_-]+)[_-]([^_-]+)\.[doc|odt|docx].*(?<!\.ok)$
But this forces me to end the string with .doc.ok or .docs.ok or .docx.ok, and I actually want to retain the file extension at the end.
If I try this:
(.*)#([^_-]+)[_-]([^_-]+)[_-]([^_-]+)[_-]([^_-]+)\..*(?<!ok\.[doc|odt|docx])$
it does not work.
Any help would be appreciated. Thank you :)
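[doc|odt|docx] doesn't do what you appear to think it does: square brackets form a character class, so it matches a single character out of d, o, c, |, t and x. Try replacing the [] with a non-capturing group: (?:doc|odt|docx)
With the backslashes escaped for a string literal:
"^([^#]*#[^-_]*)[-_](.*)$(?:(?<=(?<!\\.ok)\\.docx$)|(?<=(?<!\\.ok)\\.doc$)|(?<=(?<!\\.ok)\\.odt$))"
As a plain regex with the multiline flag:
(?m)^([^#]*#[^-_]*)[-_](.*)$(?:(?<=(?<!\.ok)\.docx$)|(?<=(?<!\.ok)\.doc$)|(?<=(?<!\.ok)\.odt$))
Or, even simpler, this also works:
(?m)^([^#]*#[^-_]*)[-_](.*)$(?<=(?<!\.ok)\.(?:docx?|odt)$)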
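If the requirement of exactly three separators after # also has to be enforced, the idea can be sketched in Python as below. This is only one possible reading of the question, not the pattern from the answer above: the function name parse_name is hypothetical, each separator is assumed to be a single _ or -, and the fields between separators are assumed to be non-empty. Python's re module only accepts fixed-width lookbehinds, so the check for .ok is placed directly before the extension instead of at the end of the pattern.

```python
import re

# Assumed reading of the requirements:
#   - anything before '#'
#   - four non-empty fields after '#', separated by exactly three '_' or '-'
#   - extension .doc, .docx or .odt, not directly preceded by ".ok"
PATTERN = re.compile(
    r"^(.*)#"                     # prefix before '#'
    r"([^_-]+)[_-]"               # field 1 and first separator
    r"([^_-]+)[_-]"               # field 2 and second separator
    r"([^_-]+)[_-]"               # field 3 and third separator
    r"([^_-]+?)"                  # field 4 (lazy, stops before the extension)
    r"(?<!\.ok)\.(?:docx?|odt)$"  # extension, rejected if ".ok" comes right before it
)


def parse_name(name):
    """Return (prefix, f1, f2, f3, f4) if the name matches, else None."""
    m = PATTERN.match(name)
    return m.groups() if m else None


if __name__ == "__main__":
    print(parse_name("CRS-ISAU-RPV#3430_Dedalus_Conc.ok.erto_AOTreviglio.doc"))
    # ('CRS-ISAU-RPV', '3430', 'Dedalus', 'Conc.ok.erto', 'AOTreviglio')
    print(parse_name("a#1_2_3_4.ok.doc"))
    # None
```

The non-capturing group (?:docx?|odt) replaces the character class [doc|odt|docx] from the question, and because the fields cannot contain _ or -, a name with more or fewer than three separators after # cannot match.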