2

I was successfully using the following code to merge multiple large XML files into a new (larger) XML file. I found at least part of it on Stack Overflow.

   $docList = new DOMDocument();

    $root = $docList->createElement('documents');
    $docList->appendChild($root);

    $doc = new DOMDocument();

    foreach ($xmlFilenames as $xmlfilename) {

        $doc->load($xmlfilename);

        $xmlString = $doc->saveXML($doc->documentElement);

        $xpath = new DOMXPath($doc);
        $query = self::getQuery();  // this is the name of the ROOT element

        $nodelist = $xpath->evaluate($query, $doc->documentElement);

        if( $nodelist->length > 0 ) {

            $node = $docList->importNode($nodelist->item(0), true);

            $xmldownload = $docList->createElement('document');

            if (self::getShowFileName())
                $xmldownload->setAttribute("filename", $xmlfilename);

            $xmldownload->appendChild($node);

            $root->appendChild($xmldownload);
        }

    }

$newXMLFile = self::getNewXMLFile();
$docList->save($newXMLFile);

I started running into out-of-memory errors as both the number of files and their sizes grew.

I found an article here which explained the issue and recommended using XMLWriter.

So now I am trying to use PHP's XMLWriter to merge multiple large XML files into a new (larger) XML file. Later, I will run XPath queries against the new file.

Code:

$xmlWriter = new XMLWriter();
$xmlWriter->openUri('mynewFile.xml');
$xmlWriter->setIndent(true);
$xmlWriter->startDocument('1.0', 'UTF-8');

$xmlWriter->startElement('documents');

$doc = new DOMDocument();

foreach($xmlfilenames as $xmlfilename) 
{
    $fileContents = file_get_contents($xmlfilename);
    $xmlWriter->writeElement('document',$fileContents);
}

$xmlWriter->endElement();
$xmlWriter->endDocument();
$xmlWriter->flush();

However, the resulting (new) XML file is no longer correct because the markup from the source files is escaped. For example:

<?xml version="1.0" encoding="UTF-8"?>

&lt;CONFIRMOWNX&gt;
&lt;Confirm&gt;
&lt;LglVeh id=&quot;GLE&quot;&gt;
&lt;AddrLine1&gt;GLEACHER &amp;amp; COMPANY&lt;/AddrLine1&gt;
&lt;AddrLine2&gt;DESCAP DIVISION&lt;/AddrLine2&gt;

Can anyone explain how to take the contents of each XML file and write them properly to the new file?

I'm burnt on this and I KNOW it'll be something simple I'm missing.

Thanks. Robert

2 Answers

4

See, the problem is that XMLWriter::writeElement is intended to, well, write a complete XML element whose content is plain text. That's why it automatically escapes (replaces & with &amp;, for example) whatever is passed to it as the second parameter.
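To make the escaping concrete, here is a minimal sketch that writes a small XML fragment through writeElement into an in-memory buffer. The fragment and the variable names are made up for illustration; the point is only that the second argument is treated as text, so its markup characters come out as entities.

```php
<?php
// Illustrative only: a tiny fragment standing in for a whole source file.
$fragment = '<Name>A &amp; B</Name>';

$writer = new XMLWriter();
$writer->openMemory();                          // buffer the output in memory
$writer->writeElement('document', $fragment);   // second argument is treated as text
$escaped = $writer->outputMemory();             // fetch what was written

// The fragment's "<", ">" and "&" are now entities, so the
// <Name> element no longer exists as markup in the output.
echo $escaped, PHP_EOL;
```

This is also why an ampersand that was already written as &amp; in the source file ends up as &amp;amp; in the merged file.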

One possible solution is to use the XMLWriter::writeRaw method instead, as it writes the contents as is, without any escaping. Obviously it doesn't validate its input, but in your case that does not seem to be a problem (as you're working with an already checked source).
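A rough sketch of how the merge loop might look with writeRaw follows. The function name mergeXmlFiles and the filename attribute are assumptions made for illustration, not part of the original code. It also assumes each source file is well-formed and has no DOCTYPE, and it strips a leading XML declaration from each file, since a declaration in the middle of the merged document would make it ill-formed.

```php
<?php
// Hypothetical helper: wraps each file's markup in <document> inside <documents>.
function mergeXmlFiles(array $xmlFilenames, string $outputFile): void
{
    $writer = new XMLWriter();
    $writer->openUri($outputFile);            // stream straight to the output file
    $writer->setIndent(true);
    $writer->startDocument('1.0', 'UTF-8');
    $writer->startElement('documents');

    foreach ($xmlFilenames as $xmlFilename) {
        $contents = file_get_contents($xmlFilename);
        if ($contents === false) {
            continue;                         // skip files that cannot be read
        }

        // Assumption: drop a leading <?xml ... ?> declaration, if present.
        $contents = preg_replace('/^\s*<\?xml\b.*?\?>\s*/s', '', $contents);

        $writer->startElement('document');
        $writer->writeAttribute('filename', basename($xmlFilename));
        $writer->writeRaw($contents);         // written as is, no escaping
        $writer->endElement();
        $writer->flush();                     // push buffered output to disk per file
    }

    $writer->endElement();                    // </documents>
    $writer->endDocument();
    $writer->flush();
}
```

It could be called as mergeXmlFiles($xmlfilenames, 'mynewFile.xml'). Only one source file is held in memory at a time, which is the reason for moving away from building the whole result in a DOMDocument.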

1 Comment

XMLWriter::writeRaw did the trick for me. I had tried it once before but used it incorrectly, since it only takes one parameter, i.e. XMLWriter::writeRaw(string $content). Thanks!

-2

Hmm, not sure why it's converting the markup to HTML entities, but you can decode it like so:

htmlspecialchars_decode($data);

It converts special HTML entities back to characters.

1 Comment

Tried $xmlWriter->writeElement('document', htmlspecialchars_decode($fileContents)); no luck.
