
I have the following XML, and I want to read it and get the value inside the "content" tag.

"<?xml version='1.0' encoding='ISO-8859-1'?>
                <ad modelVersion='0.9'>
                    <richmediaAd>
                        <content>
                            <![CDATA[<script src=\"mraid.js\"></script> 
                                <div class=\"celtra-ad-v3\"> 
                                    <img src=\"data: image/png, celtra\" style=\"display: none\"onerror=\"(function(img){ varparams={ 'channelId': '45f3f23c','clickUrl': 'http%3a%2f%2fexamplehost.com%3a53766%2fCloudMobRTBWeb%2fClickThroughHandler.ashx%3fadid%3de6983c95-9292-4e16-967d-149e2e77dece%26cid%3d352%26crid%3d850'};varreq=document.createElement('script');req.id=params.scriptId='celtra-script-'+(window.celtraScriptIndex=(window.celtraScriptIndex||0)+1);params.clientTimestamp=newDate/1000;req.src=(window.location.protocol=='https: '?'https': 'http')+': //ads.celtra.com/e7f5ce18/mraid-ad.js?';for(varkinparams){req.src+='&amp;'+encodeURIComponent(k)+'='+encodeURIComponent(params[ k ]); }img.parentNode.insertBefore(req, img.nextSibling);})(this);\"/> 
                                </div>]]>
                        </content>
                        <width>320</width>
                        <height>50</height>
                    </richmediaAd>
                </ad>"

I tried two methods (SimpleXML and DOM). Both return the value, but the "CDATA" wrapper is missing. What I got from inside the "content" tag was:

 <script src="mraid.js"></script> 
     <div class="celtra-ad-v3"> 
         <img src="data: image/png, celtra" style="display: none"onerror="(function(img){ varparams={ 'channelId': '45f3f23c','clickUrl': 'http%3a%2f%2fexamplehost.com%3a53766%2fCloudMobRTBWeb%2fClickThroughHandler.ashx%3fadid%3de6983c95-9292-4e16-967d-149e2e77dece%26cid%3d352%26crid%3d850'};varreq=document.createElement('script');req.id=params.scriptId='celtra-script-'+(window.celtraScriptIndex=(window.celtraScriptIndex||0)+1);params.clientTimestamp=newDate/1000;req.src=(window.location.protocol=='https: '?'https': 'http')+': //ads.celtra.com/e7f5ce18/mraid-ad.js?';for(varkinparams){req.src+='&amp;'+encodeURIComponent(k)+'='+encodeURIComponent(params[ k ]); }img.parentNode.insertBefore(req, img.nextSibling);})(this);"/> 
     </div>

I assume the parser is "cleaning up" the XML by stripping the CDATA wrapper. But what I want is the raw data with the CDATA markers still in it. Is there any way to achieve this? I appreciate your help.

Here are my two methods for reference. Method 1:

$type = simplexml_load_string($response['adm']) or die("Error: Cannot create object");
$data = $type->richmediaAd[0]->content;
Yii::warning((string) $data);
Yii::warning(strpos($data, 'CDATA'));

Method 2:

$doc = new \DOMDocument();
$doc->loadXML($response['adm']);
$richmediaAds = ($doc->getElementsByTagName("richmediaAd"));
foreach($richmediaAds as $richmediaAd){
    $contents = $richmediaAd->getElementsByTagName("content");
    foreach($contents as $content){
         Yii::warning($content->nodeValue);
    }
}

1 Answer

I'll improve this if I can, but you can explicitly target the CDATA section node inside your content element and pass it to $doc->saveXML( $node ). That returns the node's exact XML, CDATA markers included.

$doc = new \DOMDocument();
$doc->loadXML( $xml );

$xpath = new \DOMXPath( $doc );
$nodes = $xpath->query( '/ad/richmediaAd/content');

foreach( $nodes[0]->childNodes as $node )
{
  if( $node->nodeType === XML_CDATA_SECTION_NODE )
  {
    echo $doc->saveXML( $node ); // the node as XML, including the <![CDATA[ ... ]]> markers
  }
}

Edit: You may wish to add a fallback in case no CDATA section is found.
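
One way to handle the case where no CDATA is found, as a rough sketch rather than a definitive implementation. The helper name contentAsXml is made up for illustration, and falling back to serializing whatever child nodes exist is only an assumption about what you would want in that situation:

```php
<?php
// Hypothetical helper: returns the raw inner XML of the first <content>
// element, keeping any <![CDATA[ ... ]]> wrapper intact.
function contentAsXml(string $xml): ?string
{
    $doc = new \DOMDocument();
    if (!$doc->loadXML($xml)) {
        return null; // malformed XML
    }

    $content = $doc->getElementsByTagName('content')->item(0);
    if ($content === null) {
        return null; // no <content> element at all
    }

    // Preferred path: return the first CDATA section exactly as serialized.
    foreach ($content->childNodes as $node) {
        if ($node->nodeType === XML_CDATA_SECTION_NODE) {
            return $doc->saveXML($node);
        }
    }

    // Fallback (assumption): no CDATA, so serialize the children that exist.
    $out = '';
    foreach ($content->childNodes as $node) {
        $out .= $doc->saveXML($node);
    }
    return trim($out);
}
```

Called as contentAsXml( $response['adm'] ), this returns null when the XML cannot be parsed or has no content element, so the caller can tell "nothing there" apart from an empty string.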


Without XPath

$doc = new \DOMDocument();
$doc->loadXML( $xml );
$doc->normalize();

foreach( $doc->getElementsByTagName('content')->item(0)->childNodes as $node )
{
  if( $node->nodeType === XML_CDATA_SECTION_NODE )
  {
    echo $doc->saveXML( $node ); // the node as XML, including the <![CDATA[ ... ]]> markers
  }
}
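
To show why saveXML is used here instead of nodeValue, here is a small side-by-side sketch. The one-element XML string is invented for the demo and is not the ad markup from the question:

```php
<?php
// Made-up sample: a single element whose only child is a CDATA section.
$xml = '<content><![CDATA[<div class="x"></div>]]></content>';

$doc = new \DOMDocument();
$doc->loadXML($xml);

// No whitespace surrounds the CDATA here, so it is the first child.
$cdata = $doc->documentElement->firstChild;

// nodeValue gives only the text inside the section, without the markers.
$inner = $cdata->nodeValue;    // <div class="x"></div>

// saveXML() serializes the node itself, so the markers are kept.
$raw = $doc->saveXML($cdata);  // <![CDATA[<div class="x"></div>]]>
```

This matches what the question observed: nodeValue (and the SimpleXML string cast) hand back the unwrapped text, while serializing the CDATA node keeps the wrapper.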

3 Comments

Hi Scuzzy! Thank you for your answer, I really appreciate it. I got a "'DOMXPath' not found" error when I tried to run the code. I'm using the Yii framework and I'm not sure how to install DOMXPath...
Strange... DOMXPath should be installed by default with DOMDocument. Try new \DOMXPath( $doc ); it could be a namespacing issue. Note that I'm only using DOMXPath to cut down on the amount of code needed to select the element. As long as you can locate the "content" node, you can use the foreach approach to find the CDATA node.
Wah it works! Thank you so much you saved my life :)
