I am parsing an XML file in PHP with the simplexml_load_file() function. The file uses multiple namespaces.

    <?xml version="1.0" encoding="utf-8"?>
    <RETURN:RT xmlns:FORMM="http://example.com/5"
        xmlns:Form="http://example.com/master"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns:RETURN="http://example.com">
        <FORMM:5F>
            <Form:Info>
                <Form:SWCreatedBy>SW10002087</Form:SWCreatedBy>
            </Form:Info>
        </FORMM:5F>
    </RETURN:RT>

Here is my PHP code:

$xml = simplexml_load_file('filename');
$namespaces = $xml->getNamespaces(true);
foreach ($namespaces as $key => $extension) {
    $xml->registerXPathNamespace($key, $extension);
}

$main = $xml->xpath('//RETURN:RT/*');

foreach ($main as $itr) {
    echo '<pre>';
    print_r($itr);
    echo '</pre>';
}

Output:

 SimpleXMLElement Object
(
)

But when I remove the Form namespace from the XML, I get this output:

SimpleXMLElement Object
(
    [Form:Info] => SimpleXMLElement Object
        (
            [Form:SWCreatedBy] => SW10002087
        )

)

Can anyone give me a solution for this? I don't want to have to remove the Form namespace from the XML file every time.

1 Answer

5F is an invalid tag name - element names are not allowed to start with a digit. The parser will emit warnings if you load this (invalid) XML.

Warning: simplexml_load_string(): namespace error : Failed to parse QName 'FORMM:' in /in/HHTnS on line 33

Warning: simplexml_load_string():     <FORMM:5F> in /in/HHTnS on line 33

Warning: simplexml_load_string():            ^ in /in/HHTnS on line 33

So you will need some error handling for this and/or have the XML fixed.
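As a minimal sketch of such error handling, assuming the sample document from the question, you could let libxml collect the problems internally and inspect them afterwards. The variable names ($previous, $problems) are only illustrative.

```php
<?php
// Sample input from the question, including the invalid element name.
$xml = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<RETURN:RT
    xmlns:FORMM="http://example.com/5"
    xmlns:Form="http://example.com/master"
    xmlns:RETURN="http://example.com">
    <FORMM:5F>
        <Form:Info>
            <Form:SWCreatedBy>SW10002087</Form:SWCreatedBy>
        </Form:Info>
    </FORMM:5F>
</RETURN:RT>
XML;

// Collect libxml problems internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);
$rt = simplexml_load_string($xml);

// Copy the collected messages, then reset the libxml error state.
$problems = [];
foreach (libxml_get_errors() as $error) {
    $problems[] = sprintf('line %d: %s', $error->line, trim($error->message));
}
libxml_clear_errors();
libxml_use_internal_errors($previous);

if ($rt === false) {
    echo "The document could not be parsed at all.\n";
}
foreach ($problems as $problem) {
    echo $problem, "\n";
}
```

With this in place you can decide per upload whether to reject the file, log the problems, or continue with the tolerant parse.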

However, LibXML (the library behind SimpleXML and DOM) tolerates it, so you can still read the document.

Don't fetch namespaces from the document. That method is meant for debugging or generic transformations. The namespace URI is the identifying value of a namespace. Namespace prefixes can change at any element in an XML document. Define/register your own prefixes for the Xpath expressions and use the namespace URI in the method calls.
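To illustrate why the URI matters and the prefix does not, here is a small sketch with two hypothetical documents (not from the question) that use different prefixes for the same namespace URI. The prefix "m" is my own choice; the same Xpath expression matches both.

```php
<?php
// Two hypothetical documents: same namespace URI, different prefixes.
$docs = [
    '<a:root xmlns:a="http://example.com/master"><a:item>one</a:item></a:root>',
    '<root xmlns="http://example.com/master"><item>two</item></root>',
];

$values = [];
foreach ($docs as $source) {
    $element = simplexml_load_string($source);
    // "m" is a prefix chosen for the Xpath expression, independent of the document
    $element->registerXPathNamespace('m', 'http://example.com/master');
    foreach ($element->xpath('/m:root/m:item') as $item) {
        $values[] = (string)$item;
    }
}
var_dump($values);
```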

print_r(), var_dump(), ... use SimpleXML's mapping of XML elements. This is a compromise and has limits (especially with namespaces). Try SimpleXMLElement::asXML() to debug a SimpleXMLElement instance.

You can still access the elements using property syntax, but you might need to use SimpleXMLElement::children() to specify the namespace.

Here is an example based on your question:

$xml = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<RETURN:RT 
    xmlns:FORMM="http://example.com/5"
    xmlns:Form="http://example.com/master"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:RETURN="http://example.com">
    <FORMM:5F>
        <Form:Info>
            <Form:SWCreatedBy>SW10002087</Form:SWCreatedBy>
        </Form:Info>
   </FORMM:5F>
</RETURN:RT>
XML;

// define namespaces you're going to use
$xmlns = [
  'five' => 'http://example.com/5',
  'master' => 'http://example.com/master',
  'return' => 'http://example.com',
];

// a function to apply all of them to a SimpleXMLElement
function registerXpathNamespaces(SimpleXMLElement $target, array $xmlns) {
    foreach ($xmlns as $prefix => $namespaceURI) {
        $target->registerXpathNamespace($prefix, $namespaceURI);
    }
}

$rt = simplexml_load_string($xml);
registerXpathNamespaces($rt, $xmlns);
$list = $rt->xpath('//return:RT/*');
foreach ($list as $fiveF) {
    // serialize the node into a string for debugging
    var_dump($fiveF->asXML());
    // use the namespace list to access child elements in a specific namespace
    var_dump(
        (string)$fiveF->children($xmlns['master'])->Info->SWCreatedBy
    );
}

Output:

string(130) "<FORMM:5F>
        <Form:Info>
            <Form:SWCreatedBy>SW10002087</Form:SWCreatedBy>
        </Form:Info>
   </FORMM:5F>"
string(10) "SW10002087"

You need to register the namespaces on each SimpleXMLElement instance you are calling xpath() on.
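A short sketch of what that means for nested queries, using a simplified, hypothetical version of the document (without the invalid 5F element): the element returned by the first xpath() call is a new SimpleXMLElement, so the custom prefixes are registered on it again before the second call.

```php
<?php
// Custom prefixes for the Xpath expressions, keyed to the namespace URIs.
$xmlns = [
    'return' => 'http://example.com',
    'master' => 'http://example.com/master',
];

// Simplified sample document (an assumption, not the original upload).
$xml = <<<'XML'
<RETURN:RT xmlns:RETURN="http://example.com" xmlns:Form="http://example.com/master">
    <Form:Info>
        <Form:SWCreatedBy>SW10002087</Form:SWCreatedBy>
    </Form:Info>
</RETURN:RT>
XML;

$rt = simplexml_load_string($xml);
foreach ($xmlns as $prefix => $namespaceURI) {
    $rt->registerXPathNamespace($prefix, $namespaceURI);
}

$created = [];
foreach ($rt->xpath('/return:RT/master:Info') as $info) {
    // $info is a separate instance: register the custom prefixes again
    foreach ($xmlns as $prefix => $namespaceURI) {
        $info->registerXPathNamespace($prefix, $namespaceURI);
    }
    foreach ($info->xpath('master:SWCreatedBy') as $node) {
        $created[] = (string)$node;
    }
}
var_dump($created);
```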

In DOM, there is a separate Xpath object, so you only need to register the namespaces once. Additionally, DOMXpath::evaluate() can return scalar values (not just node lists). This is how the example would look with DOM:

// define namespaces you're going to use
$xmlns = [
  'five' => 'http://example.com/5',
  'master' => 'http://example.com/master',
  'return' => 'http://example.com',
];

$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXpath($document);
foreach ($xmlns as $prefix => $namespaceURI) {
    $xpath->registerNamespace($prefix, $namespaceURI);
}

$list = $xpath->evaluate('//return:RT/*');
foreach ($list as $fiveF) {
    // serialize the node into a string for debugging
    var_dump($document->saveXML($fiveF));
    // directly fetch the value using a string cast in Xpath
    var_dump(
        $xpath->evaluate(
            'string(master:Info/master:SWCreatedBy)',
            $fiveF
        )
    );
}

Using DOM, you can take a look at how the parser reads the broken tag.

// the RETURN:RT is valid - the parser can match it to the namespace
var_dump($document->documentElement->localName, $document->documentElement->namespaceURI);
foreach ($list as $fiveF) {
    // the "5F" or "FORMM:5F" is invalid - the parser will not match it
    var_dump($fiveF->localName, $fiveF->namespaceURI);
}

string(2) "RT"
string(18) "http://example.com"
string(8) "FORMM:5F"
NULL

So it keeps the alias/prefix as a part of the name and ignores the namespace.
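If you cannot get the XML fixed, one possible workaround, sketched here under the assumption that the parser keeps reporting the name as "FORMM:5F" without a namespace (as in the output above), is to match the element by its literal name in Xpath. This relies on the tolerant parsing behavior and may not hold for every libxml version.

```php
<?php
// Sample input from the question, including the invalid element name.
$xml = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<RETURN:RT
    xmlns:FORMM="http://example.com/5"
    xmlns:Form="http://example.com/master"
    xmlns:RETURN="http://example.com">
    <FORMM:5F>
        <Form:Info>
            <Form:SWCreatedBy>SW10002087</Form:SWCreatedBy>
        </Form:Info>
    </FORMM:5F>
</RETURN:RT>
XML;

// Keep the parser warnings out of the output while loading the broken XML.
$previous = libxml_use_internal_errors(true);
$document = new DOMDocument();
$document->loadXML($xml);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xpath = new DOMXPath($document);
$xpath->registerNamespace('master', 'http://example.com/master');

$values = [];
// match the broken element by its literal name, it has no namespace
foreach ($xpath->evaluate('//*[name() = "FORMM:5F"]') as $fiveF) {
    $values[] = $xpath->evaluate('string(master:Info/master:SWCreatedBy)', $fiveF);
}
var_dump($values);
```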

2 Comments

I want to upload the XML file through a form into the database. The user uploads it, so I don't have access to change the XML file.
The parser will tolerate 5F - but it is invalid XML. I modified my examples to use the broken XML.
