I'm processing a large XML file. To make some of the processing easier, I've used the following method, which is mentioned extensively on Stack Overflow:

$xml = simplexml_load_string($xml_string);
$json = json_encode($xml);
$array = json_decode($json,TRUE);

This has worked well, but going over my code I've noticed some instances where attributes on certain elements aren't converted correctly. The loss happens at this step: $json = json_encode($xml);

Here is a stripped down XML example.

<?xml version="1.0"?>
<property>
    <landDetails>
        <area unit="squareMeter"/>
    </landDetails>
    <buildingDetails>
        <area unit="squareMeter">100</area>
    </buildingDetails>
</property>

and here is the output.

Array (
    [landDetails] => Array (
        [area] => Array (
            [@attributes] => Array (
                [unit] => squareMeter
            )
        )
    )
    [buildingDetails] => Array (
        [area] => 100
    )
)

As seen above, if an element contains a text value, the attributes on that element are not included in the output. This causes significant data loss in the conversion.

Does anyone know how to solve this issue?

Thanks in advance!

1 Answer

The attributes are processed; they are just not output when a node has both attributes AND a text value. In that case, only the value is output.
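To illustrate that the attribute is still present on the parsed object and only disappears in the JSON round trip, here is a minimal sketch using the XML from the question (without the XML declaration). The variable names `$area`, `$value`, `$unit`, and `$roundTripped` are chosen here for illustration.

```php
<?php
// Stripped-down XML from the question.
$xml_string = <<<XML
<property>
    <landDetails>
        <area unit="squareMeter"/>
    </landDetails>
    <buildingDetails>
        <area unit="squareMeter">100</area>
    </buildingDetails>
</property>
XML;

$xml  = simplexml_load_string($xml_string);
$area = $xml->buildingDetails->area;

// Read the text value and the attribute straight from the SimpleXMLElement.
$value = (string) $area;
$unit  = (string) $area['unit'];

// The json_encode round trip keeps only the text for this node.
$array = json_decode(json_encode($xml), true);
$roundTripped = $array['buildingDetails']['area'];

var_dump($value, $unit, $roundTripped);
```

If only a handful of attributes matter, reading them directly like this may be simpler than converting the whole document to an array.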

The JSON/array conversion you use does not take that into account and keeps only the value. I'm afraid I don't know a trick to make it do so, but here is a function I used before I knew how to convert SimpleXML elements that way. It handles attributes and values separately.

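function simplexml_to_array (SimpleXMLElement $xml, &$array) {

  $name = $xml->getName();

  // Build the node in a local array. Starting from '' and then assigning
  // keys on it relies on string-to-array conversion that PHP 7.1+ no longer does.
  $node = [];

  // Nodes with children
  // Note: siblings that share a name overwrite each other (last one wins).
  foreach ($xml->children() as $child) {
    simplexml_to_array($child, $node);
  }

  // Node attributes
  foreach ($xml->attributes() as $key => $att) {
    $node['@attributes'][$key] = (string) $att;
  }

  // Node with value
  if (trim((string) $xml) != '') {
    $node[] = (string) $xml;
  }

  // Empty node : <node></node>
  $array[$name] = ($node === []) ? '' : $node;

}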

$xml = simplexml_load_string($xml);
simplexml_to_array($xml, $arr);
var_dump($arr);

Output :

array(1) {
  ["property"]=>
  array(2) {
    ["landDetails"]=>
    array(1) {
      ["area"]=>
      array(1) {
        ["@attributes"]=>
        array(1) {
          ["unit"]=>
          string(11) "squareMeter"
        }
      }
    }
    ["buildingDetails"]=>
    array(1) {
      ["area"]=>
      array(2) {
        ["@attributes"]=>
        array(1) {
          ["unit"]=>
          string(11) "squareMeter"
        }
        [0]=>
        string(3) "100"
      }
    }
  }
}
3 Comments

Looking at this a bit more, the above code doesn't seem to pass in an attribute array for the first two levels of XML elements. For example, using the above XML, if you add a color="red" attribute on property, a color="green" attribute on landDetails and buildingDetails, and a color="blue" attribute on area, only the blue attribute is exported in the resulting array.
@DevonMather Right, I don't know why this foreach loop was inside the has_children check. In fact, the has_children check was not needed. I figured out that SimpleXML returns a string with spaces and line breaks; trim is enough to detect that it has no actual value. I tried this new code on several different XML structures, and it seems OK now. :)
This code doesn't seem to work. <xml> <node1 attr1="111" attr2="222"> <node2 attr3="aaa" attr4="bbb"></node2> <node2 attr3="ccc" attr4="ddd">xxx</node2> </node1> </xml> Here aaa and bbb do not appear in the output.
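
The case in the last comment arises because both node2 elements are stored under the same array key, so the second overwrites the first. One possible way around it is to store every child name as a list. The following is only a sketch under that assumption: the function name `xml_node_to_array` and the `@value` key are hypothetical and not part of the original answer, and the resulting array shape differs from the output shown above.

```php
<?php
// Hypothetical helper, not from the original answer: converts one element
// into an array and returns it instead of writing through a reference.
function xml_node_to_array(SimpleXMLElement $xml) {
  $node = [];

  // Attributes go under '@attributes', as in the answer's function.
  foreach ($xml->attributes() as $key => $att) {
    $node['@attributes'][$key] = (string) $att;
  }

  // Text content, if any, goes under '@value' (a key name chosen here).
  $text = trim((string) $xml);
  if ($text !== '') {
    $node['@value'] = $text;
  }

  // Children: every element name maps to a list, so repeated siblings
  // are appended instead of overwriting each other.
  foreach ($xml->children() as $child) {
    $node[$child->getName()][] = xml_node_to_array($child);
  }

  return $node;
}

// The XML from the comment above.
$xml = simplexml_load_string(
  '<xml><node1 attr1="111" attr2="222">'
  . '<node2 attr3="aaa" attr4="bbb"></node2>'
  . '<node2 attr3="ccc" attr4="ddd">xxx</node2>'
  . '</node1></xml>'
);
$result = xml_node_to_array($xml);
var_dump($result);
```

The trade-off is that every child is wrapped in a list, even when it occurs only once, so the first area in the question's XML would be reached as `$result['landDetails'][0]['area'][0]`. Because XML element names cannot start with `@`, the `@attributes` and `@value` keys cannot collide with child element names.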