Parsing HTML into an array using PHP

Question

I have the HTML below:

<p>text1</p>
 <ul>
   <li>list-a1</li>
   <li>list-a2</li>
   <li>list-a3</li>
 </ul>
<p>text2</p>
 <ul>
   <li>list-b1</li>
   <li>list-b2</li>
   <li>list-b3</li>
 </ul>
<p>text3</p>

Does anyone know how to parse this HTML with PHP into a nested array like the one below? The first level should hold the text of each "p" tag and the second level the items of the "ul" tag that follows it, since each "p" tag is followed by a "ul" tag.

Array
(
    [0] => Array
        (
            [value] => text1
            [il] => Array
                (
                    [0] => list-a1
                    [1] => list-a2
                    [2] => list-a3
                )

        )

    [1] => Array
        (
            [value] => text2
            [il] => Array
                (
                    [0] => list-b1
                    [1] => list-b2
                    [2] => list-b3
                )

        )

)

I can't use string replacement or strip all the tags, because I am walking the DOM like this:

// Collect the text of every <p>, skipping those whose text contains 'document.'
foreach ($doc->getElementsByTagName('p') as $link) {
    $dont = $link->textContent;
    if (strpos($dont, 'document.') === false) {
        $links2[] = array('value' => $link->textContent);
    }
}

$er = 0;

// Collect the text of every <ul>, skipping those that contain 'favorisContribuer'
foreach ($doc->getElementsByTagName('ul') as $link) {
    $dont2 = $link->nodeValue;
    //echo $dont2;
    if (strpos($dont2, 'favorisContribuer') === false) {
        $links3[] = array('il' => $link->nodeValue);
    }
}

Possible duplicate of "How do you parse and process HTML/XML in PHP?"
– Sean, Commented Jan 3, 2017 at 1:34
I think strip_tags would do it. That's not really parsing, but you don't seem to care what the elements are. If the output goes to a browser, use nl2br after stripping.
– chris85, Commented Jan 3, 2017 at 2:12
Thank you for the reply, but I can't use replace or remove all the tags because my real HTML is much longer than the sample shown. I want a solution that uses the DOM methods.
– filip, Commented Jan 3, 2017 at 7:06

Andrew Larsen · Accepted Answer · 2017-01-03 01:37:03Z

You could use the DOMDocument class (http://php.net/manual/en/class.domdocument.php).

The example below loads the HTML and prints its text content.

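<?php

$html = '
    <p>text1</p>
    <ul>
        <li>list-a1</li>
        <li>list-a2</li>
        <li>list-a3</li>
    </ul>
    <p>text2</p>
    <ul>
        <li>list-b1</li>
        <li>list-b2</li>
        <li>list-b3</li>
    </ul>
    <p>text3</p>
';

// Parse the fragment; DOMDocument wraps it in <html><body> automatically.
$doc = new DOMDocument();
$doc->loadHTML($html);

// Take the text of the whole document, then turn each run of line breaks
// and indentation (tabs or spaces) between items into a single <br>.
$textContent = trim($doc->textContent);
$textContent = preg_replace('/\s*\R\s*/', '<br>', $textContent);

echo '
    <!DOCTYPE html>
    <html>
    <head>
        <title></title>
    </head>
    <body>
        ' . $textContent . '
    </body>
    </html>
';

?>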
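
The example above flattens everything to text. If the nested array from the question is what is needed, the same DOMDocument class can be walked element by element. The following is only a minimal sketch under some assumptions: the function name parseParagraphLists is hypothetical, the 'il' key is copied from the question's code, and each "ul" is assumed to be the next element sibling of its "p", as in the sample HTML.

```php
<?php

// Hypothetical helper: pairs each <p> with the <li> items of the <ul>
// that directly follows it. Not part of the original answer.
function parseParagraphLists(string $html): array
{
    $doc = new DOMDocument();

    // Collect libxml warnings internally so imperfect markup does not
    // print notices, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $result = [];
    foreach ($doc->getElementsByTagName('p') as $p) {
        $entry = ['value' => trim($p->textContent), 'il' => []];

        // Skip whitespace text nodes to reach the next element sibling.
        $next = $p->nextSibling;
        while ($next !== null && $next->nodeType !== XML_ELEMENT_NODE) {
            $next = $next->nextSibling;
        }

        // Assumption: only a <ul> immediately after the <p> belongs to it.
        if ($next !== null && $next->nodeName === 'ul') {
            foreach ($next->getElementsByTagName('li') as $li) {
                $entry['il'][] = trim($li->textContent);
            }
        }

        $result[] = $entry;
    }

    return $result;
}

$html = '
    <p>text1</p>
    <ul>
        <li>list-a1</li>
        <li>list-a2</li>
        <li>list-a3</li>
    </ul>
    <p>text2</p>
    <ul>
        <li>list-b1</li>
        <li>list-b2</li>
        <li>list-b3</li>
    </ul>
    <p>text3</p>
';

print_r(parseParagraphLists($html));
```

In this sketch a paragraph with no list after it (text3 here) still gets an entry, with an empty 'il' array; if such paragraphs should be dropped, the entry could be appended only when the array is non-empty. The strpos filters from the question could be applied at the same two places where textContent is read.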

However, I would suggest using JavaScript to find the content and send it to PHP instead.

1 Comment

Andrew Larsen Over a year ago

You asked for a PHP solution, so I provided a PHP-only solution. As I said, I would suggest using JavaScript for this task.
