I'm trying to get the HTML content inside a div selected by class, keeping the HTML tags inside that div but removing the wrapping div itself (here, only the div with class=test1). Example HTML:

<div class="xx1">Some Extra Test<div class="test1">Test1<div class="test2"></div>Some Text</div></div>

I need the output to be

Test1<div class="test2"></div>Some Text

This is the PHP I'm testing, but it deletes all HTML tags inside the div and outputs only the text:

 $html = '<div class="test1">Test1<div class="test2"></div>Some Text</div>';
 
 function DOC_Change_Data( $HTML = '', $Type = '', $Data = '', $Extra_Data = '' ) {
  if($HTML != '' && $Type != '' && $Data != '') {
   $doc = new DOMDocument("1.0", "UTF-8");
   $doc->preserveWhiteSpace = false;
   @$doc->loadHTML('<?xml encoding="utf-8" ?>' . $HTML, LIBXML_HTML_NODEFDTD | LIBXML_HTML_NOIMPLIED); // @ is for suppressing warnings
   $xpath = new DOMXPath($doc);
   
   if(preg_match("#remove_only_tag#is", $Extra_Data)) {
    if($Type == 'query') {
     $nodes = $xpath->query($Data);
    } else if($Type == 'getElementsByTagName') {
     $nodes = $doc->getElementsByTagName($Data);
    }
    
    foreach($nodes as $node) {
     $prent = $node->parentNode;
     $prent->replaceChild($doc->createTextNode($node->nodeValue), $node);
    }
    
    $GET_Node_Data = str_replace("<?xml encoding=\"utf-8\" ?>", "", $doc->saveHTML());
    
    return $GET_Node_Data;
   }
  }
 }


 echo DOC_Change_Data( $html, 'query', '//div [@class="test1"]', 'remove_only_tag' ) . "\n\n";

2 Answers

I would approach this by using XPath to select the target element, converting it to a string, and then using string manipulation to get the desired output. It's a bit convoluted, but the code below should work for both HTML strings (the one in your question and the one in your comment); depending on the HTML structure, it may need to be modified:

$html1 = <<<HTML
    <div class="test1">Test1<div class="test2"></div>Some Text</div>
    HTML;
$html2 = <<<HTML
    <div class="xx1">Some Extra Test<div class="test1">Test1<div class="test2"></div>Some Text</div></div>
    HTML;
$doc = new DOMDocument;
$doc->loadHTML($html2); // or $html1
$xpath = new DOMXPath($doc);

#get the right element using xpath
$node = $xpath->query('//div[@class="test1"]');

#convert the element to string
$target = $node[0]->ownerDocument->saveHTML($node[0]);

#now manipulate the string: drop everything up to the end of the opening tag, then cut off only the final closing </div>
$inner = explode('class="test1">', $target, 2)[1];
echo substr($inner, 0, strrpos($inner, '</div>'));

Output in either case:

Test1<div class="test2"></div>Some Text

You cannot use createTextNode here: a text node holds only text by definition, and $node->nodeValue returns just the text content, so the HTML tags are lost.

The correct approach should be:

  1. Loop over all the children of your $node. In your case there are 3 children:
  • The text element Test1
  • The HTML tag <div class="test2"></div>
  • The text element Some Text
  2. Move each of these children up one level: from being contained within your $node to being attached to its parent. To do so, clone the child, then insertBefore it on $prent, using $node as the reference node.
  3. Remove the initial $node.

In code:

foreach ($nodes as $node) {
    $prent = $node->parentNode;
    foreach ($node->childNodes as $childNode) {
        $prent->insertBefore($childNode->cloneNode(true), $node);
    }
    $prent->removeChild($node);
}

For the HTML in the question, this will output, as requested:

Test1<div class="test2"></div>Some Text

1 Comment

I tried this with the HTML <div class="xx1">Some Extra Test<div class="test1">Test1<div class="test2"></div>Some Text</div></div>, but I get <div class="xx1">Some Extra TestTest1<div class="test2"></div>Some Text</div>. I still need to get only Test1<div class="test2"></div>Some Text. How can I fix this too? Thank you!
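
One possible way to get only the inner markup, however deeply the target div is nested, is to serialize each child of the matched node with DOMDocument::saveHTML() and concatenate the results, instead of modifying the tree. The sketch below is an illustration rather than code from either answer: the helper name inner_html and its parameters are hypothetical, and it assumes the input is ASCII or otherwise needs no encoding hint.

```php
// Hypothetical helper (not from the original posts): returns the markup
// inside the first node matched by $query, without the wrapping element.
function inner_html(string $html, string $query): string
{
    $doc = new DOMDocument();
    // @ suppresses libxml warnings about imperfect markup.
    // Assumption: no encoding hint is needed for this input.
    @$doc->loadHTML($html);

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return ''; // nothing matched
    }

    // Serialize every child (text nodes and elements) of the first match.
    $out = '';
    foreach ($nodes->item(0)->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$html = '<div class="xx1">Some Extra Test<div class="test1">Test1<div class="test2"></div>Some Text</div></div>';
echo inner_html($html, '//div[@class="test1"]');
```

With the nested example above this should print Test1<div class="test2"></div>Some Text. Because nothing is removed from the document, the surrounding xx1 div never ends up in the output.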
