I am trying to convert an XML string into a multidimensional PHP array. The difficulty is that the XML has attributes and nested values. My code works for the top-level elements, but I am not sure how to handle the sub-levels recursively.

A sample XML is as follows:

<root>
    <item id="1" name="ItemOne">Item Value 1</item>
    <item id="2" name="ItemTwo">
        <subb>Sub Value 1</subb>
        <subb>Sub Value 2</subb>
    </item>
    <item id="3" name="ItemThree">Value 3</item>
    <something>something value</something>
</root>

This is my current function to achieve this:

$xmlString = '<root><item id="1" name="ItemOne">Item Value 1</item><item id="2" name="ItemTwo"><subb>Sub Value 1</subb><subb>Sub Value 2</subb></item><item id="3" name="ItemThree">Value 3</item><something>something value</something></root>';

function xmlToArray($xmlObject) {
    $array = [];
    foreach ($xmlObject as $item) {
        // Convert attributes to an array
        $attributes = (array)$item->attributes();
        // Add the text value to the attributes array if it exists
        $attributes['@value'] = trim((string)$item);
        $key = (string)$item->getName();
        if (isset($make_sub_array) && in_array($key, $make_sub_array)) {
            $array[$key][] = $attributes;
        }
        elseif (isset($array[$key])) {
            $make_sub_array[] = $key;
            $tmp = $array[$key];
            unset($array[$key]);
            $array[$key][] = $tmp; //existing data
            $array[$key][] = $attributes; //this data
        }
        else $array[$key] = $attributes;
    }
    return $array;
}
// Load the XML string into a SimpleXMLElement object
$xmlObject = simplexml_load_string($xmlString);
$array = xmlToArray($xmlObject);
exit('<pre>'.print_r($array,1).'</pre>');

The resulting array is shown below. I need help processing the children of the second item. They should be handled the same way as the top level: if an element name is repeated, its entries are appended with [] so they get numeric indexes under that name; otherwise the entry is stored directly under [itemname]. Thank you.

Array
(
    [item] => Array
        (
            [0] => Array
                (
                    [@attributes] => Array
                        (
                            [id] => 1
                            [name] => ItemOne
                        )
                    [@value] => Item Value 1
                )
            [1] => Array
                (
                    [@attributes] => Array
                        (
                            [id] => 2
                            [name] => ItemTwo
                        )
                    [@value] => 
                )
            [2] => Array
                (
                    [@attributes] => Array
                        (
                            [id] => 3
                            [name] => ItemThree
                        )

                    [@value] => Value 3
                )
        )
    [something] => Array
        (
            [@value] => something value
        )
)
4
  • You need a recursive function to process nested items. Commented Aug 28 at 20:31
  • You lose your subb with this $attributes['@value'] = trim((string)$item); Commented Aug 28 at 21:06
  • If you need array, just do json_decode(json_encode($xmlObject),1) Commented Aug 28 at 21:09
  • As Barmar wrote, you need at least one recursive function or a callback that is called recursively. Here is an example with PHP standard JsonSerialize that gives fine-grained control: stackoverflow.com/a/79526811/367456 Commented Sep 1 at 5:47
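
The json_decode(json_encode(...)) shortcut from the comments can be tried directly on the question's XML. This is only a sketch to show what it returns. One caveat, as far as SimpleXML's JSON output goes: an element that contains only text is reduced to a plain string, so the id and name attributes of the first and third item are dropped, which is probably not what the question wants.

```php
<?php
$xmlString = '<root><item id="1" name="ItemOne">Item Value 1</item><item id="2" name="ItemTwo"><subb>Sub Value 1</subb><subb>Sub Value 2</subb></item><item id="3" name="ItemThree">Value 3</item><something>something value</something></root>';

$xmlObject = simplexml_load_string($xmlString);

// Round-trip through JSON to get plain nested arrays
$array = json_decode(json_encode($xmlObject), true);

// Text-only elements come back as strings (their attributes are lost);
// elements with children keep an '@attributes' entry.
print_r($array);
```

If attributes on text-only elements matter, a hand-written recursive function remains the safer route.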

2 Answers

2

If you replace your:

$attributes['@value'] = trim((string)$item);

with:

$attributes['@value'] = count($item->children()) ? xmlToArray($item) : trim((string)$item);

you'll get what you want:

<pre>Array
(
    [item] => Array
        (
            [0] => Array
                (
                    [@attributes] => Array
                        (
                            [id] => 1
                            [name] => ItemOne
                        )

                    [@value] => Item Value 1
                )

            [1] => Array
                (
                    [@attributes] => Array
                        (
                            [id] => 2
                            [name] => ItemTwo
                        )

                    [@value] => Array
                        (
                            [subb] => Array
                                (
                                    [0] => Array
                                        (
                                            [@value] => Sub Value 1
                                        )

                                    [1] => Array
                                        (
                                            [@value] => Sub Value 2
                                        )

                                )

                        )

                )

            [2] => Array
                (
                    [@attributes] => Array
                        (
                            [id] => 3
                            [name] => ItemThree
                        )

                    [@value] => Value 3
                )

        )

    [something] => Array
        (
            [@value] => something value
        )

)
</pre>

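Note that calling hasChildren() on $item in this loop consistently returns false (it reports on the element's current iterator position rather than on the element itself), hence the test with count($item->children()) to decide whether to recurse into the children or to treat the element as a simple one with text content.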
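
To see that test in isolation, here is a minimal sketch. The two-element XML is made up for illustration and is not part of the question.

```php
<?php
// Hypothetical sample: <a> has one child element, <c> has only text
$xml = simplexml_load_string('<root><a><b>x</b></a><c>text</c></root>');

$counts = [];
foreach ($xml as $item) {
    // count() on the children() result gives the number of child elements
    $counts[$item->getName()] = count($item->children());
}

print_r($counts); // a => 1, c => 0
```

A count of 0 is falsy, so the ternary in the replacement line falls back to the trimmed text for leaf elements.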

With <root> in output

Your current function does not output the root node, as requested in a follow-up comment, because your foreach ($xmlObject as $item) only loops over the children of the node it is passed, so you get an array of children.

The function, which currently handles each child's contents in the loop, could be adapted to handle only the node's own contents and then call itself recursively on each child, but:

  • this would call xmlToArray() for each XML element, which is a bit like cracking a nut with a sledgehammer
  • each call to xmlToArray() would then return its own array with 1 item indexed as [itemname], and you'd still have to re-merge items in the parent node's loop
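
For comparison, the per-node variant described above might look roughly like the sketch below. The name nodeToArray() and the merge logic are assumptions made for illustration, not code from the question. It shows both drawbacks: one call per element, and a merge step in the parent's loop.

```php
<?php
// Hypothetical per-node variant: each call handles ONE element and
// returns a one-entry array [elementName => data].
function nodeToArray(SimpleXMLElement $node): array {
    $self = (array)$node->attributes(); // ['@attributes' => [...]] or []
    $children = [];
    foreach ($node->children() as $child) {
        foreach (nodeToArray($child) as $name => $data) {
            if (!isset($children[$name])) {
                $children[$name] = $data;                     // first occurrence
            } elseif (array_key_exists('@value', $children[$name])) {
                $children[$name] = [$children[$name], $data]; // second: make a list
            } else {
                $children[$name][] = $data;                   // third and later
            }
        }
    }
    $self['@value'] = $children ?: trim((string)$node);
    return [$node->getName() => $self];
}

$xmlString = '<root><item id="1" name="ItemOne">Item Value 1</item><item id="2" name="ItemTwo"><subb>Sub Value 1</subb><subb>Sub Value 2</subb></item><item id="3" name="ItemThree">Value 3</item><something>something value</something></root>';
$result = nodeToArray(simplexml_load_string($xmlString));
print_r($result);
```

Because every entry produced here carries an '@value' key, that key is used to tell a single entry apart from a list of entries during the merge.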

A better option is to wrap your <root> as the single child of an imaginary super-root. Since your xmlToArray() only uses $xmlObject's implicit iterator over its children (it relies on the array-like aspect only, not on any of the object's methods), this level-0 parameter does not even have to be a full-fledged SimpleXMLElement: a simple array will do.

Thus simply change your:

xmlToArray($xmlObject);

to:

xmlToArray([ $xmlObject ]);

and you'll get your root:

--- /tmp/1  2025-08-29 16:16:23.174752000 +0200
+++ /tmp/2  2025-08-29 16:17:07.708928000 +0200
@@ -1,5 +1,9 @@
 <pre>Array
 (
+    [root] => Array
+        (
+            [@value] => Array
+                (
                     [item] => Array
                         (
                             [0] => Array
@@ -57,6 +61,10 @@
                     [something] => Array
                         (
                             [@value] => something value
+                        )
+
+                )
+
         )
 
 )
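
Putting both parts together, a sketch of the complete function with the recursive line applied, called once without and once with the array wrapper. The only changes from the question's code are the recursive '@value' line, an explicit initialisation of $make_sub_array, and a slightly shorter list conversion; the variable names $withoutRoot and $withRoot are illustrative.

```php
<?php
function xmlToArray($xmlObject) {
    $array = [];
    $make_sub_array = []; // element names seen more than once at this level
    foreach ($xmlObject as $item) {
        // The cast yields ['@attributes' => [...]], or [] without attributes
        $attributes = (array)$item->attributes();
        // Recurse when the element has children, otherwise keep its text
        $attributes['@value'] = count($item->children())
            ? xmlToArray($item)
            : trim((string)$item);
        $key = $item->getName();
        if (in_array($key, $make_sub_array)) {
            $array[$key][] = $attributes;
        } elseif (isset($array[$key])) {
            // Second occurrence: convert the single entry into a list
            $make_sub_array[] = $key;
            $array[$key] = [$array[$key], $attributes];
        } else {
            $array[$key] = $attributes;
        }
    }
    return $array;
}

$xmlString = '<root><item id="1" name="ItemOne">Item Value 1</item><item id="2" name="ItemTwo"><subb>Sub Value 1</subb><subb>Sub Value 2</subb></item><item id="3" name="ItemThree">Value 3</item><something>something value</something></root>';
$xmlObject = simplexml_load_string($xmlString);

$withoutRoot = xmlToArray($xmlObject);   // children of <root> only
$withRoot    = xmlToArray([$xmlObject]); // <root> itself as the top-level key

print_r($withRoot);
```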

2 Comments

Is it possible to preserve and include the outermost tag <root> as well?
Done, in the second part of my answer!
-1

Here is a function I wrote many years ago and still use.

function load_xml($xml_file){
    $p=file_get_contents($xml_file);
    
    //XML does not allow bare ampersands, but my files contain them
    //Caveat: this also rewrites other entities, e.g. &lt; becomes &amp;lt;
    $p=str_replace('&amp;', '##AMPERSAND##', $p); //Good amps to temp word
    $p=str_replace('&', '##AMPERSAND##', $p); //Bad amps to temp word
    $p=str_replace('##AMPERSAND##', '&amp;', $p); //Repair all amps
    
    //Remove comments, or they will become elements.
    //Worse, if you have an element <comment>some1</comment>
    //next to a comment <!-- some2 -->, you'll get an array
    //as if you had two elements with the same name:
    //['comment'=>['some1','some2']]
    $p=preg_replace('/<!--(.*?)-->/s','',$p);
    
    //Now loading
    $p=simplexml_load_string($p);
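    if($p===false){ //Your error processing here
        echo "<font color=red><b>ERROR</b></font>\n";
        //libxml_get_errors() returns an array of LibXMLError objects;
        //it stays empty unless libxml_use_internal_errors(true) was called first
        foreach(libxml_get_errors() as $e) echo trim($e->message), "\n";
        return false;
    }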

    //And now converting object to array
    $j=json_encode($p,JSON_UNESCAPED_UNICODE); //to JSON

    //I'd like to make subelements from attributes
    $j=str_replace('\"', '&quot;',$j); //hide escaped quotes
    $j=preg_replace('/"@attributes":{((".*?":".*?",?)+)}/','$1',$j); //attrs to subelements
    $j=str_replace('&quot;', '\"', $j); //quotes back

    //Array from JSON
    $p=json_decode($j,1);
    array_walk_recursive($p,function(&$d,$k){ $d=trim($d); }); //trim spaces
        

    //If you don't need empty elements, once more Array->JSON->Array
    $j=json_encode($p,JSON_UNESCAPED_UNICODE);
    $j=preg_replace('/\[""\]/','[]',$j); //Remove empty strings from arrays
    $j=preg_replace('/,"0":""/','',$j); //Remove empty elements
    $p=json_decode($j,1);

    return $p;
}
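
The ampersand repair at the top of load_xml() can be looked at on its own. A small sketch follows; the helper name normalize_ampersands() and the sample strings are made up for illustration, while the three replacements are the ones used above. It also shows the side effect on entities other than &amp;.

```php
<?php
// Hypothetical helper wrapping the three str_replace() calls from load_xml()
function normalize_ampersands(string $s): string {
    $s = str_replace('&amp;', '##AMPERSAND##', $s); // protect good ampersands
    $s = str_replace('&', '##AMPERSAND##', $s);     // catch bare ampersands
    return str_replace('##AMPERSAND##', '&amp;', $s); // escape all of them
}

// A bare ampersand is escaped, an already escaped one is left alone
$fixed = normalize_ampersands('<a>Tom & Jerry &amp; friends</a>');
$text  = (string)simplexml_load_string($fixed);

// Side effect: any other entity is escaped a second time
$doubled = normalize_ampersands('<a>1 &lt; 2</a>');

echo $fixed, "\n", $text, "\n", $doubled, "\n";
```

If the input may contain entities such as &lt; or &quot;, the second replacement would need to skip them, for example with a regular expression instead of a plain str_replace().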
