If I use the following PHP code to convert XML to JSON:

<?php

header("Content-Type: application/json");

$resultXML = "
<QUERY>
   <Company>fcsf</Company>
   <Details>
      fgrtgrthtyfgvb
   </Details>
</QUERY>
";

$sxml = simplexml_load_string($resultXML);
echo  json_encode($sxml);
?>

I get

{"Company":"fcsf","Details":"\n      fgrtgrthtyfgvb\n   "}

However, if I use CDATA in the Details element as follows:

<?php

header("Content-Type: application/json");

$resultXML = "
<QUERY>
   <Company>fcsf</Company>
   <Details><![CDATA[
      fgrtgrthtyfgvb]]>
   </Details>
</QUERY>
";

$sxml = simplexml_load_string($resultXML);
echo  json_encode($sxml);

?>

I get the following

{"Company":"fcsf","Details":{}}

In this case the Details element is blank. Any idea why Details is blank and how to correct this?

  • Did you try removing <![CDATA[ and ]]> first? $resultXML = str_replace('<![CDATA[', '', $resultXML); $resultXML = str_replace(']]>', '', $resultXML); Commented Feb 4, 2014 at 10:22

1 Answer

This is not a problem with the JSON encoding – var_dump($sxml->Details) shows you that SimpleXML already messed it up before, as you will only get

object(SimpleXMLElement)#2 (0) {
}

– an “empty” SimpleXMLElement, the CDATA content is already missing there.

And after we figured that out, googling for “simplexml cdata” leads us straight to the first user comment on the manual page on SimpleXML Functions, that has the solution:

If you are having trouble accessing CDATA in your simplexml document, you don't need to str_replace/preg_replace the CDATA out before loading it with simplexml.

You can do this instead, and all your CDATA contents will be merged into the element contents as strings.

$xml = simplexml_load_file($xmlfile, 'SimpleXMLElement', LIBXML_NOCDATA);

So, use

$sxml = simplexml_load_string($resultXML, 'SimpleXMLElement', LIBXML_NOCDATA);

in your code, and you’ll get

{"Company":"fcsf","Details":"\n      fgrtgrthtyfgvb\n   "}

after JSON-encoding it.
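
Putting the pieces together, a minimal sketch of the corrected script might look like the following. It reuses the XML from the question; the heredoc, the error check, and the `$json` variable are additions for illustration and are not part of the original code.

```php
<?php
// Same XML as in the question, with the Details text wrapped in CDATA.
$resultXML = <<<XML
<QUERY>
   <Company>fcsf</Company>
   <Details><![CDATA[
      fgrtgrthtyfgvb]]>
   </Details>
</QUERY>
XML;

// LIBXML_NOCDATA tells libxml to merge CDATA sections into ordinary text nodes,
// so json_encode() sees the Details content as a plain string.
$sxml = simplexml_load_string($resultXML, 'SimpleXMLElement', LIBXML_NOCDATA);
if ($sxml === false) {
    // Added for illustration: simplexml_load_string() returns false on malformed XML.
    exit("Could not parse the XML\n");
}

$json = json_encode($sxml);
echo $json, PHP_EOL;
```

The whitespace around the text inside Details is preserved, so apply trim() to the value afterwards if it is not wanted.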

2 Comments

The first sentence of this answer is wrong: SimpleXML has parsed the CDATA node just fine, but neither var_dump nor json_encode output it. If you access it directly, asking for the string content with (string), you will see it is there just fine: 3v4l.org/c2FoQ. Blindly converting XML to JSON is simply not one of the design goals of SimpleXML, and this is just one of several problems you'll encounter trying to use it for that.
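
The (string) cast mentioned in this comment can be sketched as follows. This is only an illustration of the idea, not the code behind the 3v4l link: the XML is loaded without LIBXML_NOCDATA, and the output array is built by hand. The `$data` array, its key names, and the trim() call are assumptions added here.

```php
<?php
// Same XML as in the question, with the Details text wrapped in CDATA.
$resultXML = <<<XML
<QUERY>
   <Company>fcsf</Company>
   <Details><![CDATA[
      fgrtgrthtyfgvb]]>
   </Details>
</QUERY>
XML;

// Load without LIBXML_NOCDATA: the CDATA node is still part of the parsed document.
$sxml = simplexml_load_string($resultXML);

// Casting an element to string returns its text content, CDATA included.
// trim() is an addition here to drop the surrounding whitespace.
$data = [
    'Company' => (string) $sxml->Company,
    'Details' => trim((string) $sxml->Details),
];

$json = json_encode($data);
echo $json, PHP_EOL;
```

Building the array explicitly means choosing which elements end up in the JSON, instead of relying on how json_encode() happens to serialize a SimpleXMLElement.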
But this removes all HTML tags.
