How to remove HTML tag if it contains specific string [closed]

Question

Closed. This question needs to be more focused. It is not currently accepting answers.

Closed 8 years ago.

    <tr>
        <td width="300" bgcolor="#cccccc" style="text-align: right;">
         <strong>&nbsp;&nbsp;&nbsp;Sometext<br />
         </strong>
        </td>
        <td width="125" bgcolor="#009900" style="text-align: center;">
         <strong><span style="color: rgb(255, 255, 255);">
          <span style="font-size: larger;">Pricetoreplace</span>
          </span>
         </strong>
        </td>
    </tr>

I need to remove the whole <tr>...</tr> row if it contains the text "Pricetoreplace". I tried the following:

$content = preg_replace('~(<tr.*[\'"]Pricetoreplace[\'"].*tr>)~', '', $content);

But it didn't work.

What do you mean "it didn't work"? Was there an error? Did it not delete anything?
– kchason, Commented Nov 15, 2017 at 14:20
You should never parse HTML with regex. Use a PHP DOM parser instead.
– Jay Blanchard, Commented Nov 15, 2017 at 14:21
@gtktuf first off, your regex will replace everything from the first <tr to the last tr>, which is not what you expect, because you use greedy quantifiers (.*) instead of lazy quantifiers (.*?). Second, your . doesn't match newline characters; use [\s\S] instead, or turn on the s flag so that . matches newlines too. Again, though, you shouldn't even be using regex for this.
– ctwheels, Commented Nov 15, 2017 at 14:24
@gtktuf you really should be using something like this question does.
– ctwheels, Commented Nov 15, 2017 at 14:26
@gtktuf yes. It's usually bad practice to parse HTML or XML with regex. Regex should only be used for parsing HTML or XML if it's a known subset. In your case it doesn't appear to be so. I would recommend you use an HTML/XML parser and have it do the heavy lifting for you.
– ctwheels, Commented Nov 15, 2017 at 14:29
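
To make the regex problems from the comments concrete, here is a minimal sketch. The sample input and the stricter pattern are illustrative assumptions, not something posted in the thread. It shows that the pattern from the question leaves the input untouched (the . stops at newlines, and the pattern also expects quotes around the text), and that adding the s flag plus lookaheads keeps the match inside a single row.

```php
<?php
// Sample input modelled on the question: only the second row has the marker text.
$content = "<table>\n"
    . "<tr>\n<td>Sometext</td>\n</tr>\n"
    . "<tr>\n<td><span>Pricetoreplace</span></td>\n</tr>\n"
    . "</table>";

// Pattern from the question: "." stops at newlines and the pattern expects
// quotes around the text, so nothing matches and the input comes back unchanged.
$unchanged = preg_replace('~(<tr.*[\'"]Pricetoreplace[\'"].*tr>)~', '', $content);

// Sketch of a stricter pattern (an assumption, not from the thread): the "s"
// flag lets "." match newlines, and the negative lookaheads stop the match
// from running across a row boundary.
$pattern = '~<tr\b[^>]*>(?:(?!</?tr\b).)*Pricetoreplace(?:(?!</tr>).)*</tr>\s*~is';
$cleaned = preg_replace($pattern, '', $content);

echo $cleaned;
```

Even this version is brittle: nested tables or unusual markup can still break it, which is why the comments recommend a DOM parser instead.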

Jan · Accepted Answer · 2017-11-15 16:14:37Z

4

One way would be to use an xpath query:

*//td[contains(., 'Pricetoreplace')]/parent::tr

Here, we look for a td whose string value (its own text plus that of all its descendants) contains Pricetoreplace, and then select its parent tr. Each matching row is then removed from the DOM.

In PHP:

<?php

$html = <<<DATA
    <tr><td class="some other class">some text here</td></tr>
   <tr>
        <td width="300" bgcolor="#cccccc" style="text-align: right;">
         <strong>&nbsp;&nbsp;&nbsp;Sometext<br />
         </strong>
        </td>
        <td width="125" bgcolor="#009900" style="text-align: center;">
         <strong><span style="color: rgb(255, 255, 255);">
          <span style="font-size: larger;">Pricetoreplace</span>
          </span>
         </strong>
        </td>
    </tr>
DATA;

# set up the DOM
$dom = new DOMDocument();
$dom->loadHTML($html, LIBXML_HTML_NODEFDTD | LIBXML_HTML_NOIMPLIED);

# set up the xpath
$xpath = new DOMXPath($dom);

foreach ($xpath->query("*//td[contains(., 'Pricetoreplace')]/parent::tr") as $row) {
    $row->parentNode->removeChild($row);
}
echo $dom->saveHTML();
?>

This yields

<tr><td class="some other class">some text here</td></tr>

2 Comments

gtktuf Over a year ago

That's the answer, but in my case I needed to replace $dom->loadHTML($html, LIBXML_HTML_NODEFDTD | LIBXML_HTML_NOIMPLIED); with $dom->loadHTML(mb_convert_encoding($content, 'HTML-ENTITIES', 'UTF-8')); to solve some encoding problems. Also, there are no classes like class="some other class" anywhere in the posts that I need to rebuild with this PHP script; that was the main problem. Thanks for this method.

Jan Over a year ago

@gtktuf: Glad to help.
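
For readers with the same encoding issue, the XPath removal and an encoding step can be combined roughly as follows. This is a sketch under assumptions: the function name removeRowsContaining and the sample rows are hypothetical, the input is assumed to be UTF-8, and the search text is assumed to contain no single quote. The comment above used mb_convert_encoding() with 'HTML-ENTITIES'; that target encoding is deprecated as of PHP 8.2, so this sketch swaps in mb_encode_numericentity() for the same purpose.

```php
<?php
// Hypothetical helper (not from the original answer): removes every <tr> that
// has a <td> whose text contains $needle. Assumes $content is UTF-8 and that
// $needle contains no single quote, since it is placed in an XPath literal.
function removeRowsContaining(string $content, string $needle): string
{
    // Turn non-ASCII characters into numeric entities so that loadHTML()
    // does not misread the UTF-8 input as ISO-8859-1.
    $encoded = mb_encode_numericentity($content, [0x80, 0x10FFFF, 0, 0x1FFFFF], 'UTF-8');

    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($encoded);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Same idea as the answer's query: find matching cells, drop their rows.
    $xpath = new DOMXPath($dom);
    foreach ($xpath->query("//td[contains(., '$needle')]/parent::tr") as $row) {
        $row->parentNode->removeChild($row);
    }

    // Without the LIBXML_HTML_* flags, saveHTML() returns a full document
    // (doctype, <html>, <body>) around the original fragment.
    return $dom->saveHTML();
}

// Sample input (an assumption): one row to keep, one row to remove.
$content = '<table>'
    . "<tr><td>Prix</td><td>10 \u{20AC}</td></tr>"
    . '<tr><td>Sometext</td><td><span>Pricetoreplace</span></td></tr>'
    . '</table>';

echo removeRowsContaining($content, 'Pricetoreplace');
```

Because the flags are dropped here, as in the comment, the output is wrapped in a complete HTML document; keep LIBXML_HTML_NODEFDTD | LIBXML_HTML_NOIMPLIED as the second argument to loadHTML() if only the fragment is wanted.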
