I'm looking for an elegant regular expression that strips brackets, along with their content when it looks like a file name.

[Nibh justo] elit Nulla [link.pdf]  auctor ipsum molestie (link.pdf) 
Condimentum euismod non [link.xls](link.xls) [link.doc](link.doc) tempus 
In [Curabitur] et

The result should be:

Nibh justo elit Nulla auctor ipsum molestie Condimentum euismod 
non tempus In Curabitur et

I trust there must be a short way to do that. ("File" simply means the content includes a dot. No check for sentences is necessary.)

Thanks for the help.

2 Answers

Something like this?

$str = '[Nibh justo] elit Nulla [link.pdf]  auctor ipsum molestie (link.pdf) 
Condimentum euismod non [link.xls](link.xls) [link.doc](link.doc) tempus In 
[Curabitur] and other [./beta/link.pfd]';

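// Remove [...] or (...) groups whose content looks like a file name or path with an extension.
$str = preg_replace('`(\(|\[)[\w/\.-]+\.[a-z]+(\)|\])`i', '', $str);
// Strip the remaining square brackets but keep the text inside them.
$str = str_replace(array('[', ']'), '', $str);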

echo $str;

Result:

Nibh justo elit Nulla auctor ipsum molestie Condimentum euismod 
non tempus In Curabitur and other
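
The removal leaves behind the spaces and line breaks that surrounded each deleted group, so getting the single-spaced text from the question takes one more step. A minimal sketch of that, assuming whitespace may be collapsed freely; the function name `stripFileBrackets` is hypothetical and not part of the answer above:

```php
<?php
// Hypothetical helper: removes bracketed file names, unwraps other
// square-bracketed text, and normalises whitespace.
function stripFileBrackets(string $text): string
{
    // Drop [...] or (...) groups whose content ends in ".extension".
    $text = preg_replace('`[\[(][\w/.-]+\.[a-z0-9]+[\])]`i', '', $text);
    // Keep the content of the remaining square brackets, drop the brackets.
    $text = str_replace(['[', ']'], '', $text);
    // Collapse runs of whitespace (including line breaks) into one space.
    return trim(preg_replace('/\s+/', ' ', $text));
}

$str = "[Nibh justo] elit Nulla [link.pdf]  auctor ipsum molestie (link.pdf) \n"
     . "Condimentum euismod non [link.xls](link.xls) [link.doc](link.doc) tempus \n"
     . "In [Curabitur] et";

echo stripFileBrackets($str), PHP_EOL;
```

If the original line breaks should be kept, the last step could collapse only spaces and tabs (`/[ \t]+/`) instead of all whitespace.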
Try this:

(?:[\[\(]\w+\.\w+[\]\)])|(?:[\[\(](?=[0-9A-Za-z]))|(?:(?<=[0-9A-Za-z])[\]\)])

$result = preg_replace('/(?:[[(]\w+\.\w+[\])])|(?:[[(](?=[0-9A-Za-z]))|(?:(?<=[0-9A-Za-z])[\])])/m', '', $subject);

Explanation:

    <!--
(?:[\[\(]\w+\.\w+[\]\)])|(?:[\[\(](?=[0-9A-Za-z]))|(?:(?<=[0-9A-Za-z])[\]\)])

Options: ^ and $ match at line breaks

Match either the regular expression below (attempting the next alternative only if this one fails) «(?:[\[\(]\w+\.\w+[\]\)])»
   Match the regular expression below «(?:[\[\(]\w+\.\w+[\]\)])»
      Match a single character present in the list below «[\[\(]»
         A [ character «\[»
         A ( character «\(»
      Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
         Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
      Match the character “.” literally «\.»
      Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
         Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
      Match a single character present in the list below «[\]\)]»
         A ] character «\]»
         A ) character «\)»
Or match regular expression number 2 below (attempting the next alternative only if this one fails) «(?:[\[\(](?=[0-9A-Za-z]))»
   Match the regular expression below «(?:[\[\(](?=[0-9A-Za-z]))»
      Match a single character present in the list below «[\[\(]»
         A [ character «\[»
         A ( character «\(»
      Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=[0-9A-Za-z])»
         Match a single character present in the list below «[0-9A-Za-z]»
            A character in the range between “0” and “9” «0-9»
            A character in the range between “A” and “Z” «A-Z»
            A character in the range between “a” and “z” «a-z»
Or match regular expression number 3 below (the entire match attempt fails if this one fails to match) «(?:(?<=[0-9A-Za-z])[\]\)])»
   Match the regular expression below «(?:(?<=[0-9A-Za-z])[\]\)])»
      Assert that the regex below can be matched, with the match ending at this position (positive lookbehind) «(?<=[0-9A-Za-z])»
         Match a single character present in the list below «[0-9A-Za-z]»
            A character in the range between “0” and “9” «0-9»
            A character in the range between “A” and “Z” «A-Z»
            A character in the range between “a” and “z” «a-z»
      Match a single character present in the list below «[\]\)]»
         A ] character «\]»
         A ) character «\)»
-->

When the above regex is applied to:

[Nibh justo] elit Nulla [link.pdf]  auctor ipsum molestie (link.pdf) 
Condimentum euismod non [link.xls](link.xls) [link.doc](link.doc) tempus 
In [Curabitur] et

it produces the required result:

Nibh justo elit Nulla   auctor ipsum molestie  
Condimentum euismod non   tempus 
In Curabitur et
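
The first alternative only accepts `\w+\.\w+`, so a bracketed path such as `[./beta/link.pdf]` is not removed as a unit. A rough sketch of one way to widen it, assuming paths consist only of word characters, `/`, `.`, and `-`; the sample `$subject` is made up for illustration:

```php
<?php
// Same three alternatives as above, but the first one also accepts
// directory separators, dots, and hyphens before the extension (an assumption).
$pattern = '~[\[(][\w./-]+\.\w+[\])]|[\[(](?=[0-9A-Za-z])|(?<=[0-9A-Za-z])[\])]~';

// Illustrative input, not taken from the question.
$subject = 'In [Curabitur] et [./beta/link.pdf] (docs/a-b.xls) end';
$result  = preg_replace($pattern, '', $subject);

echo $result, PHP_EOL;
```

Note that the second and third alternatives strip any opening bracket followed by an alphanumeric character and any closing bracket preceded by one, so parentheses around ordinary words are removed as well.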

4 Comments

Yes, it works perfectly, but Curabitur is still lost. It looks like it doesn't match a single word in brackets.
Try this: (?:[\[\(]\w+\.\w+[\]\)])|(?:[\[\(](?=[0-9A-Za-z]))|(?:(?<=[0-9A-Za-z])[\]\)])
Thank you for the help. There are a lot of mismatches when the file links include directories, but it helped me create the expression.
If you need to match file names with or without directories, then the regex could be much simpler than the one posted here!