I am trying to develop a PHP script that replaces all divs in an HTML string with paragraphs except those which have attributes (e.g. <div id="1">). The first thing my script currently does is use a simple str_replace() to replace all occurrences of <div> with <p>, and this leaves behind any div tags with attributes and end div tags (</div>). However, replacing the </div> tags with </p> tags is a bit more problematic.
So far, I have written a preg_replace_callback() callback that is meant to convert some </div> tags into </p> tags to match the opening <p> tags, while leaving alone the </div> tags that close a <div> with attributes. Below is the script that I am using:
<?php
$input = "<div>Hello world!</div><div><div id=\"1\">How <div>are you</div> today?</div></div><div>I am fine.</div>";
$input2 = str_replace("<div>", "<p>", $input);
$output = preg_replace_callback("/(<div )|(<\/div>)/", 'replacer', $input2);

function replacer($matches){
    static $count = 0;
    $counter=count($matches);
    for($i=0;$i<$counter;$i++){
        if($matches[$i]=="<div "){
            return "<div ";
            $count++;
        } elseif ($matches[$i]=="</div>"){
            $count--;
            if ($count>=0){
                return "</div>";
            } elseif ($count<0){
                return "</p>";
                $count++;
            }
        }
    }
}

echo $output;
?>
The script basically puts all the remaining <div and </div> tags into an array and then loops through it. A counter variable is incremented when it encounters an opening <div tag and decremented when it encounters a </div> tag within the array. When the counter is less than 0, a </p> tag is returned; otherwise a </div> tag is returned.
The output of the script should be:
<p>Hello world!</p><p><div id="1">How <p>are you</p> today?</div></p><p>I am fine.</p>
Instead, the output I am getting is:
<p>Hello world!</p><p><div id="1">How <p>are you</p> today?</p></p><p>I am fine.</p>
I have spent hours making as many edits to the script as I can think of, and I keep getting the same output. Can anyone explain to me where I am going wrong or offer an alternative solution?
Any help would be appreciated.
This calls for a recursive regex using (?R). It is doable, but not worth answering individually every time someone asks. It's simpler to use a ready-made solution such as phpQuery or QueryPath (HTML traversal front ends) instead.
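As for why the posted script always emits </p>: any statement placed after a return never runs, so neither $count++ is ever executed and $count only goes down. Even with that fixed, a single counter cannot tell which kind of opening tag a given </div> belongs to, because the plain <div> tags have already become <p> before the callback sees them. One possible alternative, short of a full HTML parser, is to handle opening and closing tags in a single pass and keep a stack of what each opening tag was turned into. The following is only a minimal sketch: the function name convertPlainDivs is made up for illustration, it assumes PHP 7 or later, and it assumes well-formed, properly nested markup with no self-closing <div/> tags and no ">" inside attribute values.

```php
<?php
// Hypothetical helper (not part of the original script): rewrites
// attribute-less <div> elements as <p> in one pass and leaves any
// <div ...> that carries attributes untouched.
function convertPlainDivs(string $html): string
{
    // One entry per currently open <div>: the tag name it was rewritten to.
    $stack = [];

    return preg_replace_callback(
        '~<div\b([^>]*)>|</div>~i',
        function (array $m) use (&$stack): string {
            if ($m[0][1] !== '/') {
                // Opening tag: a plain <div> becomes <p>; one with attributes stays a div.
                $tag = trim($m[1] ?? '') === '' ? 'p' : 'div';
                $stack[] = $tag;
                return $tag === 'p' ? '<p>' : $m[0];
            }
            // Closing tag: close whatever the matching opening tag was turned into.
            // An unmatched </div> (empty stack) is left as </div>.
            $tag = array_pop($stack) ?? 'div';
            return "</$tag>";
        },
        $html
    );
}

$input = "<div>Hello world!</div><div><div id=\"1\">How <div>are you</div> today?</div></div><div>I am fine.</div>";
echo convertPlainDivs($input);
// <p>Hello world!</p><p><div id="1">How <p>are you</p> today?</div></p><p>I am fine.</p>
```

Because the opening tags are handled by the same callback, the separate str_replace() step is not needed here. For input that may be malformed, a real parser (the built-in DOM extension or one of the libraries mentioned above) is likely the safer choice.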