Regular expression to replace text between XML tags using Notepad++

I need to remove every <url> element whose <loc> contains /mailto/ (that is, replace it with an empty string) in Notepad++.

I need to find and replace the text between XML tags. Example:

XML:

<urlset>
    <url>
        <loc>http://www.file3.ir/last-books/183-Education/The-Praeger-Handbook-of-Learning-and-the-Brain-%5B2-volumes%5D</loc>
        <lastmod>2015-05-02T21:15:06+00:00</lastmod>
        <priority>0.00</priority>
        <changefreq>monthly</changefreq>
    </url>
    <url>
        <loc>http://www.file3.ir/last-books/183-Education/Learning-JavaScript</loc>
        <lastmod>2015-05-02T21:15:06+00:00</lastmod>
        <priority>0.00</priority>
        <changefreq>monthly</changefreq>
    </url>
    <url>
        <loc>http://www.file3.ir/last-books/Category/183-Education/?start=868</loc>
        <lastmod>1970-01-01T21:15:06+00:00</lastmod>
        <priority>0.00</priority>
        <changefreq>monthly</changefreq>
    </url>
    <url>
        <loc>http://www.file3.ir/component/mailto/?tmpl=component&amp;template=jm_plus&amp;link=4ae20e207319a25c17f554db7a4e9fa6f2694865</loc>
        <lastmod>2015-05-02T21:15:06+00:00</lastmod>
        <priority>0.00</priority>
        <changefreq>monthly</changefreq>
    </url>
    <url>
        <loc>http://www.file3.ir/component/mailto/?tmpl=component&amp;template=jm_plus&amp;link=034b3c240db5c92e676bdf91b7b4bdffd725c428</loc>
        <lastmod>2015-05-02T21:15:06+00:00</lastmod>
        <priority>0.00</priority>
        <changefreq>monthly</changefreq>
    </url>
    <url>
        <loc>http://www.file3.ir/component/mailto/?tmpl=component&amp;template=jm_plus&amp;link=c704ffd88b576782f9135da4848ab22a3cfb0f53</loc>
        <lastmod>2015-05-02T21:15:06+00:00</lastmod>
        <priority>0.00</priority>
        <changefreq>monthly</changefreq>
    </url>
    <url>
        <loc>http://www.file3.ir/component/mailto/?tmpl=component&amp;template=jm_plus&amp;link=6b28532ecf1950b9e755938e65c1a1b6e466483e</loc>
        <lastmod>1970-01-01T21:15:06+00:00</lastmod>
        <priority>0.00</priority>
        <changefreq>monthly</changefreq>
    </url>
    <url>
        <loc>http://www.file3.ir/component/mailto/?tmpl=component&amp;template=jm_plus&amp;link=a0ed06a8c2e075dbf1c1a23ae9203b2b464b166c</loc>
        <lastmod>1970-01-01T21:15:06+00:00</lastmod>
        <priority>0.00</priority>
        <changefreq>monthly</changefreq>
    </url>
    <url>
        <loc>http://www.file3.ir/component/mailto/?tmpl=component&amp;template=jm_plus&amp;link=63e8e699f6db77096398c390d78fa1cd1ee34b6c</loc>
        <lastmod>1970-01-01T21:15:06+00:00</lastmod>
        <priority>0.00</priority>
        <changefreq>monthly</changefreq>
    </url>
    <url>
        <loc>http://www.file3.ir/component/mailto/?tmpl=component&amp;template=jm_plus&amp;link=3a52445764cd71cad0a389f13b53faf5ae3a7dc5</loc>
        <lastmod>2015-05-02T21:15:06+00:00</lastmod>
        <priority>0.00</priority>
        <changefreq>monthly</changefreq>
    </url>
    <url>
        <loc>http://www.file3.ir/component/mailto/?tmpl=component&amp;template=jm_plus&amp;link=9ea60360f4a81636f1e13f3a4e734016317d6179</loc>
        <lastmod>2015-05-02T21:15:06+00:00</lastmod>
        <priority>0.00</priority>
        <changefreq>monthly</changefreq>
    </url>
    <url>
        <loc>http://www.file3.ir/component/mailto/?tmpl=component&amp;template=jm_plus&amp;link=42c96ca4e0c746bd9155234a619be95300714953</loc>
        <lastmod>2015-05-02T21:15:06+00:00</lastmod>
        <priority>0.00</priority>
        <changefreq>monthly</changefreq>
    </url>
    <url>
        <loc>http://www.file3.ir/last-books/183-Education/Technical-Math-Demystified</loc>
        <lastmod>2015-05-02T21:15:06+00:00</lastmod>
        <priority>0.00</priority>
        <changefreq>monthly</changefreq>
    </url>
</urlset>

After the replacement:

<urlset>
    <url>
        <loc>http://www.file3.ir/last-books/183-Education/The-Praeger-Handbook-of-Learning-and-the-Brain-%5B2-volumes%5D</loc>
        <lastmod>2015-05-02T21:15:06+00:00</lastmod>
        <priority>0.00</priority>
        <changefreq>monthly</changefreq>
    </url>
    <url>
        <loc>http://www.file3.ir/last-books/183-Education/Learning-JavaScript</loc>
        <lastmod>2015-05-02T21:15:06+00:00</lastmod>
        <priority>0.00</priority>
        <changefreq>monthly</changefreq>
    </url>
    <url>
        <loc>http://www.file3.ir/last-books/Category/183-Education/?start=868</loc>
        <lastmod>1970-01-01T21:15:06+00:00</lastmod>
        <priority>0.00</priority>
        <changefreq>monthly</changefreq>
    </url>
    <url>
        <loc>http://www.file3.ir/last-books/183-Education/Technical-Math-Demystified</loc>
        <lastmod>2015-05-02T21:15:06+00:00</lastmod>
        <priority>0.00</priority>
        <changefreq>monthly</changefreq>
    </url>

</urlset>

Any help would be appreciated.

1 Answer

You could use the following:

  • Find what: <url>[^<]+?<loc>[^<]+?/mailto/[^<]+?</loc>.+?</url>
  • Replace with: (leave empty)
  • Search Mode: Regular expression. Make sure ". matches newline" is checked.

This should remove the <url> blocks you are after, but it leaves blank lines where they used to be. To clean those up, use Edit -> Line Operations -> Remove Empty Lines (Containing Blank Characters).
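If you ever need to do the same thing outside Notepad++, the two steps above can be sketched in Python. This is only an illustration, not part of the Notepad++ workflow: the function name strip_mailto_urls and the shortened sample are made up for the example, and re.DOTALL stands in for the ". matches newline" checkbox.

```python
import re

# Same pattern as in "Find what"; re.DOTALL makes "." match newlines,
# which is what the ". matches newline" checkbox does in Notepad++.
MAILTO_URL = re.compile(
    r"<url>[^<]+?<loc>[^<]+?/mailto/[^<]+?</loc>.+?</url>",
    re.DOTALL,
)


def strip_mailto_urls(xml_text: str) -> str:
    """Remove every <url> block whose <loc> contains /mailto/ (hypothetical helper)."""
    without_mailto = MAILTO_URL.sub("", xml_text)
    # Rough equivalent of "Remove Empty Lines (Containing Blank Characters)".
    kept = [line for line in without_mailto.splitlines() if line.strip()]
    return "\n".join(kept)


# Shortened sample based on the sitemap in the question.
sample = """<urlset>
    <url>
        <loc>http://www.file3.ir/last-books/183-Education/Learning-JavaScript</loc>
        <lastmod>2015-05-02T21:15:06+00:00</lastmod>
    </url>
    <url>
        <loc>http://www.file3.ir/component/mailto/?tmpl=component&amp;template=jm_plus</loc>
        <lastmod>2015-05-02T21:15:06+00:00</lastmod>
    </url>
</urlset>"""

cleaned = strip_mailto_urls(sample)
print(cleaned)
```

The pattern relies on [^<] never crossing a tag boundary, so a <url> block without /mailto/ in its <loc> cannot match. For anything more complex than a flat sitemap like this one, an XML parser would likely be a safer choice than a regex.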
