
I have the following string, and I want to remove everything between the <br> and </span> tags, including the tags themselves:

<a class='interactive' href='http://mathbench.umd.edu/modules/microbio_counting-methods/page01.htm' target='_blank' alt='Counting bacteria' >Counting bacteria<br><span class='attribute'> - University of Maryland</span></a>

I have tried preg_replace('/<br>.*?</a>/', '', $link) but that seems to remove the href...

Any ideas how I should do this?

EDIT: After using:

 preg_replace('/<br>.*?<\/span>/', '', $link) 

I now see in the source:

 <tr>
    <td><a class='interactive' href='http://www.proteinatlas.org/' target='_blank' alt='The protein atlas' >The protein atlas<br><span class='attribute'> - Uppsala Univeristät</td> 
    <td width='16' align='center' valign='middle'><a class='delete_link' href='#' data_link='%3Ca+class%3D%27interactive%27+href%3D%27http%3A%2F%2Fwww.proteinatlas.org%2F%27+target%3D%27_blank%27+alt%3D%27The+protein+atlas%27+%3EThe+protein+atlas%3Cbr%3E%3Cspan+class%3D%27attribute%27%3E+-+Uppsala+Univerist%C3%A4t' data_topic='161' data_introduction=''><img src="../images/delete.png" width="16" height="16" alt="delete" title="delete this link" border='0' /></a></td>
  </tr>
  <tr>
    <td> funded by the Knut and Alice Wallenberg Foundation</span></a></td> 
    <td width='16' align='center' valign='middle'><a class='delete_link' href='#' data_link='+funded+by+the+Knut+and+Alice+Wallenberg+Foundation%3C%2Fspan%3E%3C%2Fa%3E' data_topic='161' data_introduction=''><img src="../images/delete.png" width="16" height="16" alt="delete" title="delete this link" border='0' /></a></td>
  </tr>

EDIT: I also tried:

preg_replace('/<br><span class=\'attribute\'>.*?<\/span>/', '', $link)

but the problem persists.

EDIT: I still see the following in the source:

<a class='interactive' href='http://www.tinyurl.com/immunologygame/' target='_blank' alt='Innate Immunology game' >Innate Immunology game<br><span class='attribute'> - University of Ballarat</span></a>
  • If the subject string contains newlines, try adding the "s" modifier to the regex: /<br>.*?<\/span>/s Commented Feb 22, 2013 at 8:52
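
To illustrate the comment above, here is a minimal sketch of what the "s" modifier changes. The input string is a made-up variant of the one in the question, with a line break added inside the span; the variable names are only for illustration.

```php
<?php
// Hypothetical input: same markup as in the question, but with a newline inside the span.
$link = "<a href='#'>Counting bacteria<br><span class='attribute'> - University\nof Maryland</span></a>";

// Without the "s" modifier, "." does not match "\n", so the pattern finds nothing to remove.
$withoutS = preg_replace('/<br>.*?<\/span>/', '', $link);

// With "s" (PCRE_DOTALL), "." also matches newlines, so the whole span is removed.
$withS = preg_replace('/<br>.*?<\/span>/s', '', $link);

var_dump($withoutS === $link); // the string is unchanged
echo $withS, PHP_EOL;          // the <br>...</span> part is gone
```

If the stored text never contains line breaks, the modifier makes no difference, but it is harmless to leave in.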

3 Answers


Try this :

<?php

$str = "<a class='interactive' href='http://mathbench.umd.edu/modules/microbio_counting-methods/page01.htm' target='_blank' alt='Counting bacteria' >Counting bacteria<br><span class='attribute'> - University of Maryland</span></a>";

$r = '/<br>(.+?)<\/span>/';

$str = preg_replace($r, '', $str);

echo $str;

?>

Output :

<a class='interactive' href='http://mathbench.umd.edu/modules/microbio_counting-methods/page01.htm' target='_blank' alt='Counting bacteria' >Counting bacteria</a>

Demo : http://regexr.com?33s84


2 Comments

I think the group in your pattern is in his case not necessary.
@tuxtimo Well, actually it's not. I just tend to use groups (either a capturing one like this, or a non-capturing one, which to be honest would be better) to make a regex more visually "revealing" and self-explanatory (well, in a way...).
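
As a quick check of the point made in these two comments, the sketch below runs the same replacement with a capturing group, a non-capturing group, and no group at all. The shortened href and the variable names are placeholders, not part of the original answer.

```php
<?php
// Shortened, hypothetical version of the string from the question.
$str = "<a href='#'>Counting bacteria<br><span class='attribute'> - University of Maryland</span></a>";

// Capturing group, as in the answer: the captured text is simply never used.
$capturing = preg_replace('/<br>(.+?)<\/span>/', '', $str);

// Non-capturing group: same match, nothing is stored.
$nonCapturing = preg_replace('/<br>(?:.+?)<\/span>/', '', $str);

// No group at all: the shortest form of the same pattern.
$noGroup = preg_replace('/<br>.+?<\/span>/', '', $str);

var_dump($capturing === $nonCapturing, $nonCapturing === $noGroup);
echo $noGroup, PHP_EOL;
```

Since the replacement string is empty, the group only affects readability here; all three calls return the same result.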

Try this

$str = "<a class='interactive' href='http://mathbench.umd.edu/modules/microbio_counting-methods/page01.htm' target='_blank' alt='Counting bacteria' >Counting bacteria<br><span class='attribute'> - University of Maryland</span></a>";

echo htmlspecialchars(preg_replace('#(<a[^>]+?>)([^<>]+).*#i', '$1$2</a>', $str));



Just use this short pattern:

/<br>.*?<\/span>/

The output will be something like this:

<a class='interactive' href='http://mathbench.umd.edu/modules/microbio_counting-methods/page01.htm' target='_blank' alt='Counting bacteria' >Counting bacteria</a>

4 Comments

I would have thought that would work, but the span tag is still showing: I have edited OP with source that shows.
Actually I can see it working, but a problem arises when there is a comma in the content between the span tags...how do I escape that?
Mmm, even after manually removing any commas in the text between the span tags, I still see the br and span tags in the source... see the edit in the OP.
The problem was having commas in the span text. I now strip the commas and replace them with dashes, which fixes the problem when combined with tuxtimo's solution.
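
The thread does not show how the links were processed before the replacement, so the following is only a guess at why a comma could matter. It assumes, purely for illustration, that the links were held in one comma-separated string and split with explode() before preg_replace ran; the sample text and variable names are invented.

```php
<?php
// Assumption (not shown in the question): links arrive as one comma-separated string.
$links = "<a href='#'>The protein atlas<br><span class='attribute'> - Uppsala, funded by a foundation</span></a>";

// Splitting on commas first cuts this single link into two pieces.
$pieces = explode(',', $links);

// Neither piece contains both <br> and </span>, so the pattern matches nothing.
$cleaned = array_map(function ($piece) {
    return preg_replace('/<br>.*?<\/span>/s', '', $piece);
}, $pieces);

// Running the replacement on the unsplit string removes the span as intended.
$fixed = preg_replace('/<br>.*?<\/span>/s', '', $links);

var_dump(count($pieces), $cleaned === $pieces);
echo $fixed, PHP_EOL;
```

If the real code splits on something else, the same idea applies: the replacement has to see the opening <br> and the closing </span> in the same string.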
