I would like to extract a section of HTML from a string. The string is returned by a PHP Joomla page and contains markup like this:

<!-- JoomlaWorks "Disqus Comment System for Joomla!" Plugin (v2.2) starts here -->

<div class="itp-fshare-floating" id="itp-fshare" style="position:fixed; top:30px !important; left:50px !important;">
</div>
<p>
    <span class="easy_img_caption"  style="display:inline-block;line-height:0.5;vertical-align:top;background-color:#F2F2F2;text-align:left;width:180px;float:left;margin:0px 10px;">

        <a href="/joomla/index.php?option=com_content&view=article&id=13:11111&catid=1:guide-sui-serivzi-cloud-computing&Itemid=3">
            <img src="/joomla/plugins/content/imagesresizecache/441a27b2a4d64b487a8e213a94f6466d.jpeg" border="0" alt="1" title="1"  style="width:180px; height:150px; ;margin:0;" />
        </a>
        <span class="easy_img_caption_inner" style="display:inline-block;line-height:normal;color:#000000;font-size:8pt;font-weight:normal;font-style:normal;padding:4px 8px;margin:0px;">1

        </span>
    </span>

    11111111111111111111111111111111111
</p>

<!-- Disqus comments counter and anchor link -->

<a class="jwDisqusListingCounterLink" href="http://clouderize.it/joomla/index.php?option=com_content&view=article&id=13:11111&catid=1:guide-sui-serivzi-cloud-computing&Itemid=3#disqus_thread" title="Add a comment">
    Add a comment
</a>

<!-- JoomlaWorks "Disqus Comment System for Joomla!" Plugin (v2.2) ends here -->

I would like to extract this section:

    <span class="easy_img_caption"  style="display:inline-block;line-height:0.5;vertical-align:top;background-color:#F2F2F2;text-align:left;width:180px;float:left;margin:0px 10px;">

        <a href="/joomla/index.php?option=com_content&view=article&id=13:11111&catid=1:guide-sui-serivzi-cloud-computing&Itemid=3">
            <img src="/joomla/plugins/content/imagesresizecache/441a27b2a4d64b487a8e213a94f6466d.jpeg" border="0" alt="1" title="1"  style="width:180px; height:150px; ;margin:0;" />
        </a>
        <span class="easy_img_caption_inner" style="display:inline-block;line-height:normal;color:#000000;font-size:8pt;font-weight:normal;font-style:normal;padding:4px 8px;margin:0px;">1

        </span>
    </span>

How can I do this? Thanks a lot.

EDIT 1: I tried this:

$content = "<html><head></head><body>" . ($this->item->text) . "</body></html>";
$dom = new DOMDocument();
$dom->loadHTML($content);

$xpath = new DOMXPath($dom);

$tags = $xpath->query('//span[@class="easy_img_caption"]/');
print_r($tags);

EDIT 2: With this code:

$content = "<html><head></head><body>" . ($this->item->text) . "</body></html>";
$content = ($this->item->text);
$dom = new DOMDocument();
$dom->loadHTML($content);

$xpath = new DOMXPath($dom);

$tags = $xpath->query('//span[@class="easy_img_caption"]');
//echo "<textarea>".print_r($dom->saveXml($tags->item(0)))."</textarea>";
foreach ($tags as $tag) {
    $innerHTML = '';
    $children = $tag->childNodes;
    foreach ($children as $child) {
        // Import each child into a scratch document and serialize it
        $tmp_doc = new DOMDocument();
        $tmp_doc->appendChild($tmp_doc->importNode($child, true));
        $innerHTML .= $tmp_doc->saveHTML();
    }

    echo $innerHTML;
}

This returns:

<a href="/joomla/index.php?option=com_content&view=article&id=13:11111&catid=1:guide-sui-serivzi-cloud-computing&Itemid=3">
    <img src="/joomla/plugins/content/imagesresizecache/441a27b2a4d64b487a8e213a94f6466d.jpeg" border="0" alt="1" title="1" style="width:180px; height:150px; ;margin:0;">
</a>
<span class="easy_img_caption_inner" style="display:inline-block;line-height:normal;color:#000000;font-size:8pt;font-weight:normal;font-style:normal;padding:4px 8px;margin:0px;">1</span>

The problem is that I also want the enclosing span:

<span class="easy_img_caption"  style="display:inline-block;line-height:0.5;vertical-align:top;background-color:#F2F2F2;text-align:left;width:180px;float:left;margin:0px 10px;">

What change do I have to make to the XPath query?

Thanks again.

  • (related) Best Methods to parse HTML Commented Nov 6, 2011 at 12:46
  • I am not very good with DOM and XMLReader. Can you help me with some code? Commented Nov 6, 2011 at 12:50
  • Sure. Have a look at stackoverflow.com/search?q=dom+html+user%3A208809 Commented Nov 6, 2011 at 12:53
  • Is my XPath query wrong? Commented Nov 6, 2011 at 13:04
  • Remove the trailing slash and print with $dom->saveXml($tags->item(0));. The query methods return a DOMNodeList. Also, when you use loadHTML you don't have to add the HTML skeleton around $this->item because DOM will add that automatically. Commented Nov 6, 2011 at 13:06
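
A minimal sketch of the DOM-only approach described in the last comment, for readers who prefer not to use an external library. The XPath query has no trailing slash, and the matched node itself is serialized, so the enclosing span is included in the output. The function name extractOuterHtml and the short sample string are hypothetical stand-ins (the sample replaces $this->item->text). The sketch uses saveHTML() with a node argument, which assumes PHP 5.3.6 or later, and type declarations that assume PHP 7.1 or later; saveXml($node), as suggested in the comment, should work similarly but emits XML-style markup.

```php
<?php
// Hypothetical helper: returns the outer HTML of the first <span> whose
// class attribute equals $class exactly, or null when nothing matches.
function extractOuterHtml(string $html, string $class): ?string
{
    $dom = new DOMDocument();

    // Real-world markup (e.g. unescaped "&" in URLs) makes libxml emit
    // warnings; collect them internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html); // loadHTML adds the html/body skeleton itself
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);

    // No trailing slash. $class is assumed to contain no double quotes.
    $nodes = $xpath->query('//span[@class="' . $class . '"]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    // Serializing the node itself yields the span plus its children.
    return $dom->saveHTML($nodes->item(0));
}

// Shortened sample standing in for $this->item->text (an assumption).
$content = '<p><span class="easy_img_caption" style="float:left;">'
    . '<a href="/joomla/index.php?option=com_content&amp;view=article">'
    . '<img src="a.jpeg" alt="1" /></a>'
    . '<span class="easy_img_caption_inner">1</span>'
    . '</span>11111</p>';

echo extractOuterHtml($content, 'easy_img_caption');
```

The difference from the loop over childNodes in the question is only which node gets serialized: passing the matched span to saveHTML() returns its opening tag, its children and its closing tag in one string.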

1 Answer

Here is a solution for your test string using the phpQuery library (http://code.google.com/p/phpquery/):

<?php

require('phpQuery/phpQuery.php');

$testString =

'<!-- JoomlaWorks "Disqus Comment System for Joomla!" Plugin (v2.2) starts here -->

<div class="itp-fshare-floating" id="itp-fshare" style="position:fixed; top:30px !important; left:50px !important;">
</div>
<p>
<span class="easy_img_caption"  style="display:inline-block;line-height:0.5;vertical-align:top;background-color:#F2F2F2;text-align:left;width:180px;float:left;margin:0px 10px;">

    <a href="/joomla/index.php?option=com_content&view=article&id=13:11111&catid=1:guide-sui-serivzi-cloud-computing&Itemid=3">
        <img src="/joomla/plugins/content/imagesresizecache/441a27b2a4d64b487a8e213a94f6466d.jpeg" border="0" alt="1" title="1"  style="width:180px; height:150px; ;margin:0;" />
    </a>
    <span class="easy_img_caption_inner" style="display:inline-block;line-height:normal;color:#000000;font-size:8pt;font-weight:normal;font-style:normal;padding:4px 8px;margin:0px;">1

    </span>
</span>

11111111111111111111111111111111111
</p>

<!-- Disqus comments counter and anchor link -->

<a class="jwDisqusListingCounterLink" href="http://clouderize.it/joomla/index.php?option=com_content&view=article&id=13:11111&catid=1:guide-sui-serivzi-cloud-computing&Itemid=3#disqus_thread" title="Add a comment">
Add a comment
</a>

<!-- JoomlaWorks "Disqus Comment System for Joomla!" Plugin (v2.2) ends here -->';

$doc = phpQuery::newDocument($testString);

$extraction=pq('.easy_img_caption:eq(0)')->htmlOuter();

echo  $extraction;

/* outputs
<span class="easy_img_caption" style="display:inline-block;line-height:0.5;vertical-align:top;background-color:#F2F2F2;text-align:left;width:180px;float:left;margin:0px 10px;">

    <a href="/joomla/index.php?option=com_content&amp;view=article&amp;id=13:11111&amp;catid=1:guide-sui-serivzi-cloud-computing&amp;Itemid=3">
        <img src="/joomla/plugins/content/imagesresizecache/441a27b2a4d64b487a8e213a94f6466d.jpeg" border="0" alt="1" title="1" style="width:180px; height:150px; ;margin:0;"></a>
    <span class="easy_img_caption_inner" style="display:inline-block;line-height:normal;color:#000000;font-size:8pt;font-weight:normal;font-style:normal;padding:4px 8px;margin:0px;">1

    </span>
</span>
*/

?>

1 Comment

Hi, how can I do this with DOM? Thanks, I appreciate this, but I would prefer not to use external libraries.
