1

I have this method that converts an XML string into a PHP array with different keys and values to fully make sense of that XML appropriately. However, when there are multiple children of the same kind, I'm not getting the desired result from the array and I'm confused on how to alter the method to do so.

This is what the method looks like:

/**
 * Converts a XML string to an array
 *
 * @param $xmlString
 * @return array
 */
private function parseXml($xmlString)
{
    $doc = new DOMDocument;
    $doc->loadXML($xmlString);
    $root = $doc->documentElement;
    $output[$root->tagName] = $this->domnodeToArray($root, $doc);

    return $output;
}

/**
 * @param $node
 * @param $xmlDocument
 * @return array|string
 */
private function domNodeToArray($node, $xmlDocument)
{
    $output = [];
    switch ($node->nodeType)
    {
        case XML_CDATA_SECTION_NODE:
        case XML_TEXT_NODE:
            $output = trim($node->textContent);
            break;
        case XML_ELEMENT_NODE:
            for ($i = 0, $m = $node->childNodes->length; $i < $m; $i++)
            {
                $child = $node->childNodes->item($i);
                $v = $this->domNodeToArray($child, $xmlDocument);

                if (isset($child->tagName))
                {
                    $t = $child->tagName;

                    if (!isset($output['value'][$t]))
                    {
                        $output['value'][$t] = [];
                    }
                    $output['value'][$t][] = $v;
                }
                else if ($v || $v === '0')
                {
                    $output['value'] = htmlspecialchars((string)$v, ENT_XML1 | ENT_COMPAT, 'UTF-8');
                }
            }

            if (isset($output['value']) && $node->attributes->length && !is_array($output['value']))
            {
                $output = ['value' => $output['value']];
            }

            if (!$node->attributes->length && isset($output['value']) && !is_array($output['value']))
            {
                $output = ['attributes' => [], 'value' => $output['value']];
            }

            if ($node->attributes->length)
            {
                $a = [];
                foreach ($node->attributes as $attrName => $attrNode)
                {
                    $a[$attrName] = (string)$attrNode->value;
                }
                $output['attributes'] = $a;
            }
            else
            {
                $output['attributes'] = [];
            }

            if (isset($output['value']) && is_array($output['value']))
            {
                foreach ($output['value'] as $t => $v)
                {
                    if (is_array($v) && count($v) == 1 && $t != 'attributes')
                    {
                        $output['value'][$t] = $v[0];
                    }
                }
            }
            break;
    }

    return $output;
}

Here is some example XML:

<?xml version="1.0" encoding="UTF-8"?>
<characters>
   <character>
      <name2>Sno</name2>
      <friend-of>Pep</friend-of>
      <since>1950-10-04</since>
      <qualification>extroverted beagle</qualification>
   </character>
   <character>
      <name2>Pep</name2>
      <friend-of>Sno</friend-of>
      <since>1966-08-22</since>
      <qualification>bold, brash and tomboyish</qualification>
   </character>
</characters>

Running the method and passing that XML as its parameter, will result with this array:

array:1 [▼
  "characters" => array:2 [▼
    "value" => array:1 [▼
      "character" => array:2 [▼
        0 => array:2 [▼
          "value" => array:4 [▼
            "name2" => array:2 [▼
              "attributes" => []
              "value" => "Sno"
            ]
            "friend-of" => array:2 [▼
              "attributes" => []
              "value" => "Pep"
            ]
            "since" => array:2 [▼
              "attributes" => []
              "value" => "1950-10-04"
            ]
            "qualification" => array:2 [▼
              "attributes" => []
              "value" => "extroverted beagle"
            ]
          ]
          "attributes" => []
        ]
        1 => array:2 [▼
          "value" => array:4 [▼
            "name2" => array:2 [▼
              "attributes" => []
              "value" => "Pep"
            ]
            "friend-of" => array:2 [▼
              "attributes" => []
              "value" => "Sno"
            ]
            "since" => array:2 [▼
              "attributes" => []
              "value" => "1966-08-22"
            ]
            "qualification" => array:2 [▼
              "attributes" => []
              "value" => "bold, brash and tomboyish"
            ]
          ]
          "attributes" => []
        ]
      ]
    ]
    "attributes" => []
  ]
]

What I want it to result to is (indentation could be wrong):

array:1 [▼
  "characters" => array:2 [▼
    "value" => array:2 [▼
      0 => [
        "character" => array:1 [▼
            "value" => array:4 [▼
              "name2" => array:2 [▼
                  "attributes" => []
                  "value" => "Sno"
                ]
                "friend-of" => array:2 [▼
                  "attributes" => []
                  "value" => "Pep"
                ]
                "since" => array:2 [▼
                  "attributes" => []
                  "value" => "1950-10-04"
                ]
                "qualification" => array:2 [▼
                  "attributes" => []
                  "value" => "extroverted beagle"
                ]
              ]
              "attributes" => []
            ]
          ]
        ]
        1 => array:2 [▼
          "character" => array:1 [▼
            "value" => array:4 [▼
              "name2" => array:2 [▼
                "attributes" => []
                "value" => "Pep"
              ]
              "friend-of" => array:2 [▼
                "attributes" => []
                "value" => "Sno"
              ]
              "since" => array:2 [▼
                "attributes" => []
                "value" => "1966-08-22"
              ]
              "qualification" => array:2 [▼
                "attributes" => []
                "value" => "bold, brash and tomboyish"
              ]
            ]
            "attributes" => []
          ]
        ]
      ]
    ]
    "attributes" => []
  ]
]

So basically, I want the characters key's value key to be an array of two items, which basically includes the 2 character keys. This is only to happen if there are many of the same element on the same branch. The way it currently is, where the character key is an array with 2 elements doesn't work in my situation.

Altering the method above to reflect my needs hasn't been possible for me yet and I'm not sure what kind of approach I should take. Altering an array like this from a DOMDocument instance seems quite complicated.

2 Answers 2

1

The problem is when to add in a new level and when to carry on with just adding the data. I've changed this logic, adding comments to the code to help understand what happens and when...

private function domNodeToArray($node, $xmlDocument)
{
    $output = [];
    switch ($node->nodeType)
    {
        case XML_CDATA_SECTION_NODE:
        case XML_TEXT_NODE:
            $output = trim($node->textContent);
            break;
        case XML_ELEMENT_NODE:
            for ($i = 0, $m = $node->childNodes->length; $i < $m; $i++)
            {
                $child = $node->childNodes->item($i);
                $v = $this->domNodeToArray($child, $xmlDocument);

                if (isset($child->tagName))
                {
                    $t = $child->tagName;

//                     if (!isset($output['value'][$t]))
//                     {
//                         $output['value'][$t] = [];
//                     }
                    // If the element already exists
                    if (isset($output['value'][$t]))
                    {
                        // Copy the existing value to new level
                        $output['value'][] = [$t => $output['value'][$t]];
                        // Add in new value
                        $output['value'][] = [$t => $v];
                        // Remove old element
                        unset($output['value'][$t]);
                    }
                    // If this has already been added at a new level
                    elseif ( isset($output['value'][0][$t]))   
                    {
                        // Add it to existing extra level
                        $output['value'][] = [$t => $v];
                    }
                    else    {
                        $output['value'][$t] = $v;
                    }
                }
                else if ($v || $v === '0')
                {
                    $output['value'] = htmlspecialchars((string)$v, ENT_XML1 | ENT_COMPAT, 'UTF-8');
                }
            }

            if (isset($output['value']) && $node->attributes->length && !is_array($output['value']))
            {
                $output = ['value' => $output['value']];
            }

            if (!$node->attributes->length && isset($output['value']) && !is_array($output['value']))
            {
                $output = ['attributes' => [], 'value' => $output['value']];
            }

            if ($node->attributes->length)
            {
                $a = [];
                foreach ($node->attributes as $attrName => $attrNode)
                {
                    $a[$attrName] = (string)$attrNode->value;
                }
                $output['attributes'] = $a;
            }
            else
            {
                $output['attributes'] = [];
            }
            break;
    }

    return $output;
}

I've tried it with...

<?xml version="1.0" encoding="UTF-8"?>
<characters>
   <character>
      <name2>Sno</name2>
      <friend-of>Pep</friend-of>
      <since>1950-10-04</since>
      <qualification>extroverted beagle</qualification>
   </character>
   <character>
      <name2>Pep</name2>
      <friend-of>Sno</friend-of>
      <since>1966-08-22</since>
      <qualification>bold, brash and tomboyish</qualification>
   </character>
   <character>
      <name2>Pep2</name2>
      <friend-of>Sno</friend-of>
      <since>1966-08-23</since>
      <qualification>boldish, brashish and tomboyish</qualification>
   </character>
</characters>

to check that the <character> elements are all added to the right level.

Sign up to request clarification or add additional context in comments.

Comments

1

I've done some changes to your function but I'm not sure if this is what you need.

private function domNodeToArray($node, $xmlDocument)
{
    $output = ['value' => [], 'attributes' => []];

    switch ($node->nodeType) {
    case XML_CDATA_SECTION_NODE:
    case XML_TEXT_NODE:
        $output = trim($node->textContent);
        break;
    case XML_ELEMENT_NODE:
        for ($i = 0, $m = $node->childNodes->length; $i < $m; $i++) {
            $child = $node->childNodes->item($i);
            $v = $this->domNodeToArray($child, $xmlDocument);

            if (isset($child->tagName)) {
                $t = $child->tagName;

                if (isset($output['value'][$t])) {
                    $output['value'][] = [$t => $output['value'][$t]];
                    $output['value'][] = [$t => $v];
                    unset($output['value'][$t]);
                } else {
                    $output['value'][$t] = $v;
                }
            } elseif (($v && is_string($v)) || $v === '0') {
                $output['value'] = htmlspecialchars((string)$v, ENT_XML1 | ENT_COMPAT, 'UTF-8');
            }
        }

        if ($node->attributes->length) {
            foreach ($node->attributes as $attrName => $attrNode) {
                $output['attributes'][$attrName] = (string) $attrNode->value;
            }
        }

        break;
    }

    return $output;
}

Output

array:1 [▼
  "characters" => array:2 [▼
    "value" => array:2 [▼
      0 => array:1 [▼
        "character" => array:2 [▼
          "value" => array:4 [▼
            "name2" => array:2 [▼
              "value" => "Sno"
              "attributes" => []
            ]
            "friend-of" => array:2 [▼
              "value" => "Pep"
              "attributes" => []
            ]
            "since" => array:2 [▼
              "value" => "1950-10-04"
              "attributes" => []
            ]
            "qualification" => array:2 [▼
              "value" => "extroverted beagle"
              "attributes" => []
            ]
          ]
          "attributes" => []
        ]
      ]
      1 => array:1 [▼
        "character" => array:2 [▼
          "value" => array:4 [▼
            "name2" => array:2 [▼
              "value" => "Pep"
              "attributes" => []
            ]
            "friend-of" => array:2 [▼
              "value" => "Sno"
              "attributes" => []
            ]
            "since" => array:2 [▼
              "value" => "1966-08-22"
              "attributes" => []
            ]
            "qualification" => array:2 [▼
              "value" => "bold, brash and tomboyish"
              "attributes" => []
            ]
          ]
          "attributes" => []
        ]
      ]
    ]
    "attributes" => []
  ]
]

6 Comments

This is almost what I need, but you have removed a lot of redundant code. Thanks! So, as I've said in the question itself, I only need that array wrapping if we're dealing with many elements of the same kind in the same branch/level, your method wraps everything into an array right now, even the keys inside the value of character. I'd really appreciate your continued help.
@Aborted Take a look at my updated. I've added an if before the array_push. Now you should get the right structure you've wanted.
Thanks again for the help! It works perfectly with the XML that I provided, however, with this other XML I tested it with, I'm getting Array to string conversion .
@Aborted Hmm, it looks like some child with no tagName issue. Updated the elseif.
Thanks again for the help! I'm a bit confused at the output within the DISPATCHNOTIFICATION_ITEM's value key. It has both numeric indexes and string keys. LINE_ITEM_ID is a direct key, while PRODUCT_ID and some others are contained within an indexed array. pastebin.com/3XBH8s30
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.