9

Can anyone help with converting data from an XML document into an associative array? I'm running into issues given that the XML structure is sort of 3D and the array is more of a 2D structure (please forgive my lack of correct terminology throughout). The XML elements have attributes, children and grand-children (but I never know their names), so I figured I'd try to make the key in the array a concatenation of each child/attribute name and the value equal to, well, the value. Trouble is I need the attribute name and value as part of the concatenated array key to make it unique...

For example:

<Computer id="1">   
    <OS>
        <Name>Linux</Name>
        <Age>Older than me</Age>
    </OS>
</Computer>
<Computer id="2">
    <OS>
        <Name>Windows</Name>
        <Age>Not so much</Age>
    </OS>
</Computer>

Should ideally give:

[Computer-id-1-OS-Name] = 'Linux'
[Computer-id-1-OS-Age] = 'Older than me'
[Computer-id-2-OS-Name] = 'Windows'
[Computer-id-2-OS-Age] = 'Not so much'

But I'm getting this result:

[Computer-id] = '1'
[Computer-OS-Name] = 'Linux'
[Computer-OS-Age] = 'Older than me'
[Computer-id] = '2'
[Computer-OS-Name] = 'Windows'
[Computer-OS-Age] = 'Not so much'

This means the [Computer-id] key is not unique. I'm using a recursive function to read in the values, but I can't figure out how to get the attribute name and attribute value into the names of the subordinate keys. (By the way, there is a good reason for doing this seemingly illogical task!) Any help would be greatly appreciated.

Here is the function which 'flattens' the XML data after it has been read into a multi-dimensional array. I'm not sure I'm going about this the right way!

function flattenArray ($array, $baseName = NULL)
{
    // foreach replaces the original reset()/each() loop; each() was removed in PHP 8
    foreach ($array as $key => $value) {
        $outKey = $key . "-";
        if (is_array($value)) {
            flattenArray($value, $baseName . $outKey);
        } else {
            $finalKey = $baseName . rtrim($outKey, '-');
            $finalValue = $value;
            echo "$finalKey = $finalValue\n";
        }
    }
}
  • Can you post the code that's giving the incorrect output? Commented Jun 30, 2011 at 11:10
  • Using an XML lib could help. php.net/manual/en/refs.xml.php Commented Jun 30, 2011 at 11:11
  • Can you please explain why you want to do that? Why can't you use a tree structure as offered by DOM or SimpleXml? Commented Jun 30, 2011 at 11:19

6 Answers

46

This worked great for me, and it was simple.

$ob = simplexml_load_file('test.xml');
$json = json_encode($ob);
$array = json_decode($json, true);
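
For the XML in the question, here is a rough sketch of what this round trip produces. It assumes the two <Computer> elements are wrapped in a <root> element (my addition, since a document needs a single root) and it loads from a string rather than from test.xml.

```php
<?php
// Assumption: <root> wrapper added for illustration; it is not in the question's XML.
$xml = '<root>
    <Computer id="1"><OS><Name>Linux</Name><Age>Older than me</Age></OS></Computer>
    <Computer id="2"><OS><Name>Windows</Name><Age>Not so much</Age></OS></Computer>
</root>';

$ob = simplexml_load_string($xml);
$array = json_decode(json_encode($ob), true);

// Repeated <Computer> elements become a numerically indexed list,
// and each element's attributes are collected under the "@attributes" key.
echo $array['Computer'][0]['@attributes']['id'], "\n"; // 1
echo $array['Computer'][1]['OS']['Name'], "\n";        // Windows
```

Note that the result is still a nested array, so a flattening step that folds the "@attributes" values into the key would still be needed to get keys like Computer-id-1-OS-Name.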

1 Comment

It works great, but fails with CDATA. In order to support CDATA, see this filter: php.net/manual/en/function.simplexml-load-string.php#82686
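
One commonly used option, which may or may not be what the linked note describes, is to pass the LIBXML_NOCDATA flag when loading, so that CDATA sections are merged into ordinary text before the JSON round trip. A minimal sketch, using a shortened sample document of my own:

```php
<?php
// Shortened sample input, for illustration only.
$xml = '<xml>
<ToUserName><![CDATA[toUser]]></ToUserName>
<CreateTime>12345678</CreateTime>
</xml>';

// LIBXML_NOCDATA asks libxml to merge CDATA sections into text nodes,
// so their content survives json_encode().
$ob = simplexml_load_string($xml, 'SimpleXMLElement', LIBXML_NOCDATA);
$array = json_decode(json_encode($ob), true);

print_r($array);
```

With the flag set, ToUserName comes back as the plain string toUser rather than an empty array.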
5

Here's my function to generate an associative array, derived from "Recursive cast from SimpleXMLObject to Array":

function xml2assoc($obj, &$arr) {
  $children = $obj->children();
  foreach ( $children as $elementName => $node ) {

    if (!isset($arr[$elementName])) {
      $arr[$elementName] = array();
    }
    $temp = array();
    $attributes = $node->attributes();
    foreach ( $attributes as $attributeName => $attributeValue ) {
      $attribName = strtolower(trim((string) $attributeName));
      $attribVal = trim((string) $attributeValue);
      $temp[$attribName] = $attribVal;
    }
    $text = (string) $node;
    $text = trim($text);
    if (strlen($text) > 0) {
      $temp ['text='] = $text;
    }
    $arr[$elementName][] = $temp;
    $nextIdx = count($arr[$elementName]);
    xml2assoc($node, $arr[$elementName][$nextIdx - 1]);
  }
  return;
}

$xml = '<xml>
<ToUserName><![CDATA[toUser]]></ToUserName>
<FromUserName><![CDATA[fromUser]]></FromUserName>
<CreateTime>12345678</CreateTime>
<MsgType><![CDATA[news]]></MsgType>
<ArticleCount>2</ArticleCount>
<Articles>
<item>
<Title><![CDATA[title1]]></Title> 
<Description><![CDATA[description1]]></Description>
<PicUrl><![CDATA[picurl]]></PicUrl>
<Url><![CDATA[url]]></Url>
</item>
<item>
<Title><![CDATA[title]]></Title>
<Description><![CDATA[description]]></Description>
<PicUrl><![CDATA[picurl]]></PicUrl>
<Url><![CDATA[url]]></Url>
</item>
</Articles>
</xml> ';

$dom = new SimpleXMLElement($xml);

$arr = array();

xml2assoc($dom, $arr);
print_r($arr);

generated array:

Array
(
    [ToUserName] => Array
        (
            [0] => Array
                (
                    [text=] => toUser
                )

        )

    [FromUserName] => Array
        (
            [0] => Array
                (
                    [text=] => fromUser
                )

        )

    [CreateTime] => Array
        (
            [0] => Array
                (
                    [text=] => 12345678
                )

        )

    [MsgType] => Array
        (
            [0] => Array
                (
                    [text=] => news
                )

        )

    [ArticleCount] => Array
        (
            [0] => Array
                (
                    [text=] => 2
                )

        )

    [Articles] => Array
        (
            [0] => Array
                (
                    [item] => Array
                        (
                            [0] => Array
                                (
                                    [Title] => Array
                                        (
                                            [0] => Array
                                                (
                                                    [text=] => title1
                                                )

                                        )

                                    [Description] => Array
                                        (
                                            [0] => Array
                                                (
                                                    [text=] => description1
                                                )

                                        )

                                    [PicUrl] => Array
                                        (
                                            [0] => Array
                                                (
                                                    [text=] => picurl
                                                )

                                        )

                                    [Url] => Array
                                        (
                                            [0] => Array
                                                (
                                                    [text=] => url
                                                )

                                        )

                                )

                            [1] => Array
                                (
                                    [Title] => Array
                                        (
                                            [0] => Array
                                                (
                                                    [text=] => title
                                                )

                                        )

                                    [Description] => Array
                                        (
                                            [0] => Array
                                                (
                                                    [text=] => description
                                                )

                                        )

                                    [PicUrl] => Array
                                        (
                                            [0] => Array
                                                (
                                                    [text=] => picurl
                                                )

                                        )

                                    [Url] => Array
                                        (
                                            [0] => Array
                                                (
                                                    [text=] => url
                                                )

                                        )

                                )

                        )

                )

        )

)

4

One example could be:

$dom = new DOMDocument;
$dom->loadXML(
    '<root>
        <Computer id="1">   
            <OS>
                <Name>Linux</Name>
                <Age>Older than me</Age>
            </OS>
        </Computer>

        <Computer id="2">
            <OS>
                <Name>Windows</Name>
                <Age>Not so much</Age>
            </OS>
        </Computer>
    </root>'
);

$xpath = new DOMXPath($dom);
$result = array();

foreach ($xpath->query('//*[count(*) = 0]') as $node) {
    $path = array();
    $val = $node->nodeValue;

    do {
        if ($node->hasAttributes()) {
            foreach ($node->attributes as $attribute) {
                $path[] = sprintf('%s[%s]', $attribute->nodeName, $attribute->nodeValue);
            }
        }
        $path[] = $node->nodeName;
    }
    while ($node = $node->parentNode);

    $result[implode('/', array_reverse($path))] = $val;
}

print_r($result);

Output:

Array
(
    [#document/root/Computer/id[1]/OS/Name] => Linux
    [#document/root/Computer/id[1]/OS/Age] => Older than me
    [#document/root/Computer/id[2]/OS/Name] => Windows
    [#document/root/Computer/id[2]/OS/Age] => Not so much
)

That's not exactly what you're looking for, but it's a start and can easily be tweaked to give different results.
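
As one possible tweak, the sketch below produces the key format asked for in the question (Computer-id-1-OS-Name). It stops walking up before the root element, so neither #document nor root appears in the key, and it appends each attribute as name-value to its element's segment. The variable names are my own; treat it as an illustration rather than a finished solution.

```php
<?php
$dom = new DOMDocument;
$dom->loadXML(
    '<root>
        <Computer id="1"><OS><Name>Linux</Name><Age>Older than me</Age></OS></Computer>
        <Computer id="2"><OS><Name>Windows</Name><Age>Not so much</Age></OS></Computer>
    </root>'
);

$xpath = new DOMXPath($dom);
$result = array();

// Select every element that has no child elements (the leaves)
foreach ($xpath->query('//*[count(*) = 0]') as $leaf) {
    $path = array();
    $node = $leaf;

    // Walk up while the parent is still an element; this skips the root element
    while ($node instanceof DOMElement && $node->parentNode instanceof DOMElement) {
        $segment = array($node->nodeName);
        foreach ($node->attributes as $attribute) {
            $segment[] = $attribute->nodeName;
            $segment[] = $attribute->nodeValue;
        }
        $path[] = implode('-', $segment);
        $node = $node->parentNode;
    }

    // Segments were collected leaf-first, so reverse them before joining
    $result[implode('-', array_reverse($path))] = $leaf->nodeValue;
}

print_r($result);
```

Keys will still collide if two sibling elements share a name and have no distinguishing attribute, so this only works when the attributes really do make each path unique.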

2

Read the XML into a DOM object, loop through it, and save the results into an array. It's that simple.
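
A minimal sketch of that idea, assuming the question's XML wrapped in a <root> element. The flattenElement function and its parameters are hypothetical names of mine; it recurses top-down and builds each key from the element names and attributes along the way.

```php
<?php
// Hypothetical helper: recursively flattens a DOM element into $out.
function flattenElement(DOMElement $element, $prefix, array &$out)
{
    $key = $prefix === '' ? $element->nodeName : $prefix . '-' . $element->nodeName;

    // Fold attribute names and values into the key to keep it unique
    foreach ($element->attributes as $attribute) {
        $key .= '-' . $attribute->nodeName . '-' . $attribute->nodeValue;
    }

    $hasChildElements = false;
    foreach ($element->childNodes as $child) {
        if ($child instanceof DOMElement) {
            $hasChildElements = true;
            flattenElement($child, $key, $out);
        }
    }

    // Only leaf elements contribute a value
    if (!$hasChildElements) {
        $out[$key] = trim($element->textContent);
    }
}

$dom = new DOMDocument;
$dom->loadXML(
    '<root>
        <Computer id="1"><OS><Name>Linux</Name><Age>Older than me</Age></OS></Computer>
        <Computer id="2"><OS><Name>Windows</Name><Age>Not so much</Age></OS></Computer>
    </root>'
);

$flat = array();
// Start below the root element so "root" is not part of the keys
foreach ($dom->documentElement->childNodes as $child) {
    if ($child instanceof DOMElement) {
        flattenElement($child, '', $flat);
    }
}

print_r($flat);
```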

1 Comment

This is low-value relative to other posts on this page. Looks more like a comment/hint than an answer.
1

Simple arrays may be 2d but multi-dimensional arrays can replicate a hierarchical structure like xml very easily.

Google 'associative multi-dimensional array php' for more info.

However, as has already been stated, PHP has a built-in xml parser so there shouldn't be any need to recreate the xml within an array anyhow, let alone flatten it to a simple array.

Within PHP your array structure should resemble this:

$computers["computers"]["computer-1"]["OS"]["Name"] = "Linux";
$computers["computers"]["computer-1"]["OS"]["Age"] = "Older Than Me";

$computers["computers"]["computer-2"]["OS"]["Name"] = "Windows";
$computers["computers"]["computer-2"]["OS"]["Age"] = "Not so much";

etc...
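
As a rough sketch of how such a structure could be filled from the XML with SimpleXML: this assumes the question's elements are wrapped in a <root> element and that every <Computer> has exactly two levels below it, as in the example. The "computer-N" key naming follows the lines above.

```php
<?php
$xml = simplexml_load_string(
    '<root>
        <Computer id="1"><OS><Name>Linux</Name><Age>Older than me</Age></OS></Computer>
        <Computer id="2"><OS><Name>Windows</Name><Age>Not so much</Age></OS></Computer>
    </root>'
);

$computers = array('computers' => array());

foreach ($xml->Computer as $computer) {
    // Use the id attribute to build a unique key per computer
    $key = 'computer-' . (string) $computer['id'];

    foreach ($computer->children() as $section) {        // e.g. <OS>
        foreach ($section->children() as $field) {       // e.g. <Name>, <Age>
            $computers['computers'][$key][$section->getName()][$field->getName()] = (string) $field;
        }
    }
}

echo $computers['computers']['computer-1']['OS']['Name'], "\n"; // Linux
```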

0

I modified user655000's answer to be closer to how json_decode(json_encode($dom)) would format/return the data. I also made the initial array parameter optional, since it's just going to be empty anyway.

I couldn't use the decode(encode) method because there appear to be bugs in PHP's encode function, which resulted in decode() returning null on some sample data. I tried a safer version of the encode function, but it ran out of memory.

There is a minor behavior difference: the decode(encode) method will discard any attributes (and possibly children too) if the node has text. My method does not.

function readxml($xmlfile, $recursive = false){
    $ob = simplexml_load_file($xmlfile);
    //primary method
    $json = json_encode($ob);
    $array = json_decode($json, true);
    if(is_null($array)){//backup method
        $array = xml2assoc($ob);
    }
    return $array;
}

function xml2assoc($obj, &$arr = null) {
    $children = $obj->children();//->count(); 
    $nodes = [];
    foreach ( $children as $elementName => $node ) {
        if(!isset($nodes[$elementName])){
            $nodes[$elementName] = 0;
        }
        $nodes[$elementName]++;
    }
    $indexes = [];

    if($arr === null){
        $arr = [];
    }
    foreach ( $children as $elementName => $node ) {
        $temp = array();
        $grandchildren = $node->children()->count();
        
        //attributes        
        $attributes = $node->attributes();
        foreach ( $attributes as $attributeName => $attributeValue ) {
            $attribName = trim((string) $attributeName);
            $attribVal = trim((string) $attributeValue);
            $temp["@attributes"][$attribName] = $attribVal;
        }
        
        //text      
        $text = (string) $node;
        $text = trim($text);
        if (strlen($text) > 0) {
            if(count($temp) == 0 && $grandchildren == 0){
                $temp = $text;//discard the children/attribute data since there aren't any
            } else {
                $temp["NodeText"] = $text;//retain the children/attributes
            }
        }       
        
        //grandchildren
        if($temp || is_string($temp) || $grandchildren > 0 ){
            if( $nodes[$elementName] == 1 ){//only one of its kind
                $arr[$elementName] = $temp;
                xml2assoc($node, $arr[$elementName]);
            } else {//has multiple nodes of the same kind
                if(isset($indexes[$elementName])){
                    $indexes[$elementName]++;
                } else {
                    $indexes[$elementName] = 0;
                }
                $index = $indexes[$elementName];
                $arr[$elementName][$index] = $temp;
                xml2assoc($node, $arr[$elementName][$index]);
            }
        }
    }
    return $arr;
}
