I have an XML file that contains XML within it. How would I go about parsing everything into an array or object?

<DATA>
    <ROW>
        <id>1</id>
        <message_id>123456789</message_id>
        <brand_name>SAMPLE</brand_name>
        <request_xml>
&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;
&lt;Email&gt;
&lt;Service&gt;
&lt;LogMessage/&gt;
&lt;Delivery&gt;
&lt;Synchronous/&gt;
&lt;/Delivery&gt;
&lt;/Service&gt;
&lt;Model&gt;
&lt;Head&gt;
&lt;From&gt;[email protected] &lt;/From&gt;
&lt;To&gt;[email protected]&lt;/To&gt;
&lt;Subject&gt;Your Question&lt;/Subject&gt;
&lt;/Head&gt;
&lt;ns2:ContactUs&gt;
&lt;ns2:Sender&gt;
&lt;ns2:FirstName&gt;John&lt;/ns2:FirstName&gt;
&lt;/ns2:Sender&gt;
&lt;/ns2:ContactUs&gt;
&lt;/Model&gt;
&lt;InlineImages/&gt;
&lt;History/&gt;
&lt;/Email&gt;
        </request_xml>
        <http_status>400</http_status>
        <created_by>admin</created_by>
        <created_on>2014-09-08 01:56:59</created_on>
    </ROW>
</DATA>

My goal is to end up with something like this:

SimpleXMLElement Object
(
    [ROW] => SimpleXMLElement Object
        (
            [id] => 1
            [message_id] => 123456789
            [brand_name] => SAMPLE
            [request_xml] => SimpleXMLElement Object
                (
                    ...
                    [LogMessage] => 
                    ...
                    [from] => [email protected]
                    ...
                )

            [http_status] => 400
            [created_by] => admin
            [created_on] => 2014-09-08 01:56:59
        )
)

I didn't put all levels of the request_xml in my example, but you get the idea. Basically I want that request_xml to be parsed like the rest of the XML file.

How could I achieve this? Thanks in advance for any help on this!

4 Comments

  • Why not just parse main string in SimpleXML, extract substring, parse that as SimpleXML and add sub XML as node to main XML? Commented Sep 10, 2014 at 9:43
  • How would I deal with the encoding? Commented Sep 10, 2014 at 9:44
  • You could either use html_entity_decode or str_replace with a specific set of entities. Commented Sep 10, 2014 at 9:49
  • To deal with the encoding, I ended up using simplexml_load_string($xmlfile, 'SimpleXMLElement', LIBXML_NOENT); so I didn't need to use html_entity_decode. Commented Sep 10, 2014 at 13:39

1 Answer

If you read the node value of the request_xml element, the parser returns it with the entities already decoded, so you get the inner XML as a plain string.

$outer = new DOMDocument();
$outer->loadXML($xml);
$xpath = new DOMXPath($outer);

$innerXml = $xpath->evaluate('string(/DATA/ROW/request_xml)');
echo $innerXml;

Output:

<?xml version="1.0" encoding="UTF-8"?>
<Email>
<Service>
<LogMessage/>
...

You can load the inner xml into a separate document object.

$inner = new DOMDocument();
$inner->loadXML(trim($innerXml));
echo $inner->saveXML();
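
Since the question asks for SimpleXMLElement objects, the same two-step approach can be sketched with SimpleXML. This is only a minimal sketch under assumptions: the sample document is shortened from the question, and the ns2 elements are left out so that it parses without a namespace definition.

```php
<?php
// Shortened outer document; the inner XML is stored as escaped text.
$xml = <<<'XML'
<DATA>
    <ROW>
        <id>1</id>
        <request_xml>
&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;
&lt;Email&gt;
&lt;Model&gt;
&lt;Head&gt;
&lt;Subject&gt;Your Question&lt;/Subject&gt;
&lt;/Head&gt;
&lt;/Model&gt;
&lt;/Email&gt;
        </request_xml>
        <http_status>400</http_status>
    </ROW>
</DATA>
XML;

$outer = simplexml_load_string($xml);

// Casting the element to string yields its text content with
// &lt;, &gt; and &quot; already decoded by the parser.
$innerXml = trim((string) $outer->ROW->request_xml);

// trim() matters: the XML declaration has to be at the very start of the string.
$inner = simplexml_load_string($innerXml);

echo $inner->Model->Head->Subject, "\n";
```

The result is two separate SimpleXMLElement trees ($outer and $inner) rather than one nested object. Merging them into a single tree would need extra work, for example importing the nodes through DOM.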

But in your example the inner XML is broken: it uses the ns2 prefix without a namespace definition (an xmlns:ns2 attribute), so the parser complains about it. Once the definition is added, it works:

Demo: https://eval.in/191279
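
To illustrate the namespace point, here is a hedged sketch of the decoded inner document with a definition for ns2 added. The URI urn:example:ns2 is a placeholder chosen for this example; the real URI is not given in the question. The XPath prefix c is likewise arbitrary.

```php
<?php
// Decoded inner XML, shortened. The xmlns:ns2 attribute on <Email> is the
// part the question's sample lacks; "urn:example:ns2" is a placeholder URI.
$innerXml = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<Email xmlns:ns2="urn:example:ns2">
<Model>
<ns2:ContactUs>
<ns2:Sender>
<ns2:FirstName>John</ns2:FirstName>
</ns2:Sender>
</ns2:ContactUs>
</Model>
</Email>
XML;

$inner = new DOMDocument();
$inner->loadXML($innerXml);

$xpath = new DOMXPath($inner);
// The prefix used in expressions is bound here. It does not have to match
// the prefix used in the document; only the namespace URI has to match.
$xpath->registerNamespace('c', 'urn:example:ns2');

$firstName = $xpath->evaluate('string(/Email/Model/c:ContactUs/c:Sender/c:FirstName)');
echo $firstName, "\n";
```

Registering the prefix once on the DOMXPath object is enough for every later expression evaluated through it.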


2 Comments

Thanks for your answer. I omitted the namespace definition in my question, but it actually is there. For some reason, simplexml_load_string() still fails to parse these ns2 tags. That is not a big issue, though. What I really care about is the content of the fields, so I can do a quick str_replace on the tags before parsing.
SimpleXML's namespace handling is a little complex. You can register your own prefixes using SimpleXMLElement::registerXPathNamespace() and use them with xpath(); some of the other methods are namespace aware, too. But a registration is only valid for that one element, so on a child element you have to register the prefixes again. DOM uses a separate object for XPath, so you have to register them only once. stackoverflow.com/a/25571382/2265374
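
What the last comment describes might look roughly like this in SimpleXML. It is a sketch under assumptions: urn:example:ns2 is again a placeholder URI and the prefix c is arbitrary.

```php
<?php
// Shortened inner XML with a placeholder namespace URI for ns2.
$innerXml = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<Email xmlns:ns2="urn:example:ns2">
<Model>
<ns2:ContactUs>
<ns2:Sender>
<ns2:FirstName>John</ns2:FirstName>
</ns2:Sender>
</ns2:ContactUs>
</Model>
</Email>
XML;

$email = simplexml_load_string($innerXml);

// Register a prefix on this element for use in its xpath() calls.
$email->registerXPathNamespace('c', 'urn:example:ns2');
$senders = $email->xpath('//c:Sender');
$sender = $senders[0];

// The element returned by xpath() is a new object, so the prefix is
// registered again before running a query relative to it.
$sender->registerXPathNamespace('c', 'urn:example:ns2');
$names = $sender->xpath('c:FirstName');
echo $names[0], "\n";

// children() is one of the namespace-aware methods: it takes the URI directly.
$viaChildren = (string) $sender->children('urn:example:ns2')->FirstName;
echo $viaChildren, "\n";
```

With children() there is no prefix to register, which can be simpler than xpath() when only a few known fields are needed.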
