    <tr>
        <td width="300" bgcolor="#cccccc" style="text-align: right;">
         <strong>&nbsp;&nbsp;&nbsp;Sometext<br />
         </strong>
        </td>
        <td width="125" bgcolor="#009900" style="text-align: center;">
         <strong><span style="color: rgb(255, 255, 255);">
          <span style="font-size: larger;">Pricetoreplace</span>
          </span>
         </strong>
        </td>
    </tr>

I need to remove the whole <tr>...</tr> row if it contains the text "Pricetoreplace". I tried the following:

$content = preg_replace('~(<tr.*[\'"]Pricetoreplace[\'"].*tr>)~', '', $content);

But it didn't work.

  • What do you mean "it didn't work"? Was there an error? Did it not delete anything? Commented Nov 15, 2017 at 14:20
  • You should never parse HTML with regex. Use a PHP DOM parser instead. Commented Nov 15, 2017 at 14:21
  • @gtktuf first off, you're going to replace everything from the first instance to the last tr>, so your regex is not going to do what you expect (you use greedy quantifiers .* instead of lazy quantifiers .*?). Second, your . doesn't match newline characters; you should use [\s\S] instead, or turn on the s flag so that . matches newlines too. Again, though, you shouldn't even be using regex for this. Commented Nov 15, 2017 at 14:24
  • @gtktuf you really should be using something like this question does. Commented Nov 15, 2017 at 14:26
  • @gtktuf yes. It's usually bad practice to parse HTML or XML with regex. Regex should only be used for parsing HTML or XML if it's a known subset. In your case it doesn't appear to be so. I would recommend you use an HTML/XML parser and have it do the heavy lifting for you. Commented Nov 15, 2017 at 14:29
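
The quantifier point raised in the comments can be seen in a small sketch. The two-row sample string and the variable names below are assumptions made for illustration, not part of the original question. It shows that switching to lazy quantifiers and the s flag alone still removes too much, because the match starts at the first <tr. A tempered pattern that refuses to cross a </tr> behaves better on this flat sample, but it would still break on nested tables, which is why a parser remains the safer choice.

```php
<?php
// Illustrative input (an assumption): two rows, only the second has the marker.
$content = "<table>\n"
    . "<tr><td>Keep me</td></tr>\n"
    . "<tr>\n<td><span>Pricetoreplace</span></td>\n</tr>\n"
    . "</table>";

// Lazy quantifiers plus the "s" flag: the match still begins at the FIRST <tr,
// so the row that should be kept is removed as well.
$lazy = preg_replace('~<tr.*?Pricetoreplace.*?</tr>~s', '', $content);

// Tempered pattern: the negative lookahead stops the scan from crossing a </tr>
// before the marker is found, so only the row containing the marker matches.
$tempered = preg_replace(
    '~<tr\b(?:(?!</tr>).)*?Pricetoreplace.*?</tr>~s',
    '',
    $content
);

var_dump(strpos($lazy, 'Keep me') !== false);     // bool(false): the wrong row was lost too
var_dump(strpos($tempered, 'Keep me') !== false); // bool(true): the other row survives
```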

1 Answer

One way would be to use an xpath query:

*//td[contains(., 'Pricetoreplace')]/parent::tr

Here, we look for a td whose string value (the concatenated text of all its descendants) contains Pricetoreplace, and then step up to its parent tr. That row is then removed from the DOM.


In PHP:

<?php

$html = <<<DATA
    <tr><td class="some other class">some text here</td></tr>
   <tr>
        <td width="300" bgcolor="#cccccc" style="text-align: right;">
         <strong>&nbsp;&nbsp;&nbsp;Sometext<br />
         </strong>
        </td>
        <td width="125" bgcolor="#009900" style="text-align: center;">
         <strong><span style="color: rgb(255, 255, 255);">
          <span style="font-size: larger;">Pricetoreplace</span>
          </span>
         </strong>
        </td>
    </tr>
DATA;

# set up the DOM
$dom = new DOMDocument();
$dom->loadHTML($html, LIBXML_HTML_NODEFDTD | LIBXML_HTML_NOIMPLIED);

# set up the xpath
$xpath = new DOMXPath($dom);

foreach ($xpath->query("*//td[contains(., 'Pricetoreplace')]/parent::tr") as $row) {
    $row->parentNode->removeChild($row);
}
echo $dom->saveHTML();
?>


This yields

<tr><td class="some other class">some text here</td></tr>
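
The . in the query matters because the text sits inside nested strong and span elements rather than directly in the td. As a minimal sketch (the one-row table and the variable names are assumptions for illustration), comparing contains(., ...) with contains(text(), ...) shows the difference: the first tests the string value of the whole td, while the second only looks at the td's own text node children.

```php
<?php
// Illustrative markup (an assumption): the text is nested two levels below the td.
$dom = new DOMDocument();
$dom->loadHTML(
    '<table><tr><td><strong><span>Pricetoreplace</span></strong></td></tr></table>'
);
$xpath = new DOMXPath($dom);

// "." is the string value of the td, i.e. all descendant text concatenated.
$viaStringValue = $xpath->query("//td[contains(., 'Pricetoreplace')]")->length;

// "text()" selects only text nodes that are direct children of the td;
// there are none here, so nothing matches.
$viaTextNodes = $xpath->query("//td[contains(text(), 'Pricetoreplace')]")->length;

echo $viaStringValue, "\n"; // 1
echo $viaTextNodes, "\n";   // 0
```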

2 Comments

That's the answer, but in my case I needed to replace $dom->loadHTML($html, LIBXML_HTML_NODEFDTD | LIBXML_HTML_NOIMPLIED); with $dom->loadHTML(mb_convert_encoding($content, 'HTML-ENTITIES', 'UTF-8')); to solve some encoding problems. Also, none of the posts I need to rebuild with this PHP script contain classes like class="some other class"; that was the main problem. Thanks for this method.
@gtktuf: Glad to help.
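
For the encoding issue mentioned in the comment above, here is a hedged sketch of how the two pieces might be combined. The helper name removeRowsContaining and the sample table are hypothetical. Note that it swaps in mb_encode_numericentity() for the commenter's mb_convert_encoding(..., 'HTML-ENTITIES', 'UTF-8'), because the 'HTML-ENTITIES' target is deprecated as of PHP 8.2. It assumes the input is UTF-8 and that the mbstring and dom extensions are available.

```php
<?php
// Hypothetical helper (not from the answer): remove every <tr> that has a <td>
// whose text contains $needle, treating $content as UTF-8.
function removeRowsContaining(string $content, string $needle): string
{
    // Turn non-ASCII characters into numeric entities so that loadHTML(),
    // which tends to assume ISO-8859-1 when no charset is declared,
    // does not mangle them.
    $encoded = mb_encode_numericentity(
        $content,
        [0x80, 0x10FFFF, 0, 0x1FFFFF],
        'UTF-8'
    );

    $dom = new DOMDocument();
    libxml_use_internal_errors(true); // collect parser warnings instead of printing them
    $dom->loadHTML($encoded);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    // Assumption: $needle contains no single quote, since it is inlined into the query.
    foreach ($xpath->query("//td[contains(., '$needle')]/parent::tr") as $row) {
        $row->parentNode->removeChild($row);
    }

    // Without LIBXML_HTML_NOIMPLIED the result is a full document
    // (doctype, html and body wrappers included).
    return $dom->saveHTML();
}

$content = '<table><tr><td>Café</td></tr>'
    . '<tr><td><span>Pricetoreplace</span></td></tr></table>';
echo removeRowsContaining($content, 'Pricetoreplace');
```

Because the flags from the answer are dropped here, as in the comment, the output includes the html and body wrappers; pass the flags back in if only the fragment is wanted.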
