6

I have a directory of very large XML files with a structure as this:

file1.xml:

<root>
 <EmployeeInfo attr="one" />
 <EmployeeInfo attr="two" />
 <EmployeeInfo attr="three" />
</root>

file2.xml:

<root>
 <EmployeeInfo attr="four" />
 <EmployeeInfo attr="five" />
 <EmployeeInfo attr="six" />
</root>

Now I am looking for a simple way to merge these files (*.xml) into one output file:

<root>
 <EmployeeInfo attr="one" />
 <EmployeeInfo attr="two" />
 <EmployeeInfo attr="three" />
 <EmployeeInfo attr="four" />
 <EmployeeInfo attr="five" />
 <EmployeeInfo attr="six" />
</root>

I was thinking about using pure XSLT such as this one:

<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <Container>
      <xsl:copy-of select="document('file1.xml')/root/*"/>
      <xsl:copy-of select="document('file2.xml')/root/*"/>
    </Container>
  </xsl:template>
</xsl:stylesheet>

This works but isn't as flexible as I want. As a novice with PowerShell (version 2), eager to learn best practices for working with XML in PowerShell, I am wondering: what is the simplest, purest PowerShell way of merging the structure of XML documents into one?

Cheers, Joakim

2 Answers

11

While the XSLT way to do this is pretty short, so is the PowerShell way:

$files = Get-ChildItem -Filter *.xml
$finalXml = "<root>"
foreach ($file in $files) {
    [xml]$xml = Get-Content $file.FullName
    # take the InnerXml of the root element, so the <root> tags
    # themselves are not copied into the merged document
    $finalXml += $xml.root.InnerXml
}
$finalXml += "</root>"
([xml]$finalXml).Save("$pwd\final.xml")
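For comparison, the same merge can be sketched through the XmlDocument API rather than string concatenation; `Merge-XmlFiles` and its parameters are illustrative names, not an existing cmdlet. `ImportNode` copies each child of every input file's root element into the target document:

```powershell
# Sketch: merge the root children of several XML files into one document
# using the XmlDocument API instead of string concatenation.
# Function name and parameters are illustrative.
function Merge-XmlFiles {
    param([string[]]$Path, [string]$OutFile)
    $merged = [xml]"<root></root>"
    foreach ($file in $Path) {
        [xml]$doc = Get-Content $file
        foreach ($node in $doc.DocumentElement.ChildNodes) {
            # ImportNode re-parents a copy of the node into the target document
            [void]$merged.DocumentElement.AppendChild($merged.ImportNode($node, $true))
        }
    }
    $merged.Save($OutFile)
}
```

Called with, for example, `Merge-XmlFiles -Path (Get-ChildItem *.xml | ForEach-Object { $_.FullName }) -OutFile "$pwd\final.xml"`, it produces the same output without ever treating the XML as raw text.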

Hope this helps,


3 Comments

If the very large XML files are really large, this will consume a large amount of memory and could end with an OutOfMemoryException.
Thanks, I'll try this as a quick fix!
in your example, the append should be $finalXml += $xml.root.InnerXml, so the inner <root> tags aren't copied. Except that, works like a charm :)
2

Personally I would not use PowerShell for such a task.

Typically you use PowerShell to access config files like this:

$config = [xml](gc web.config)

Then you can work with the XML as with objects. Pretty cool. If you need to process large XML structures, however, using [xml] (which is equivalent to XmlDocument) is quite memory expensive.

However, that's about the extent of PowerShell's built-in XML support (get-command *xml* -CommandType cmdlet will give you all the XML-related commands).
It is of course possible to use .NET classes for XML operations, but that code won't be as pretty as a true PowerShell approach. So for your task you would need to use some readers/writers, which is imho not worth doing.
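For completeness, a rough sketch of what that reader/writer route could look like (the function name and parameters are illustrative). It streams each input with XmlReader and copies nodes through XmlWriter, so memory use stays flat regardless of file size:

```powershell
# Sketch: streaming merge via .NET XmlReader/XmlWriter, avoiding loading
# whole documents into memory. Names are illustrative.
function Merge-XmlStream {
    param([string[]]$Path, [string]$OutFile)
    $writer = [System.Xml.XmlWriter]::Create($OutFile)
    $writer.WriteStartElement("root")
    foreach ($file in $Path) {
        $reader = [System.Xml.XmlReader]::Create($file)
        [void]$reader.MoveToContent()              # position on the root element
        if (-not $reader.IsEmptyElement -and $reader.Read()) {
            while (-not $reader.EOF -and
                   $reader.NodeType -ne [System.Xml.XmlNodeType]::EndElement) {
                # WriteNode streams the current subtree to the writer
                # and advances the reader past it
                $writer.WriteNode($reader, $true)
            }
        }
        $reader.Close()
    }
    $writer.WriteEndElement()
    $writer.Close()
}
```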

That's why I think XSLT is the better approach ;) If you need flexibility, you can generate the XSLT template during script execution or just replace the file names; that's no problem.
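A hedged sketch of driving such an XSLT from PowerShell via System.Xml.Xsl.XslCompiledTransform (the function name and paths are illustrative, not from the answer). Note that XslCompiledTransform disables the document() function by default, so it has to be enabled explicitly through XsltSettings:

```powershell
# Sketch: apply an XSLT file from PowerShell, with the document()
# function enabled. Function name is illustrative.
function Invoke-Xslt {
    param([string]$Stylesheet, [string]$InputXml, [string]$OutFile)
    # XsltSettings(enableDocumentFunction, enableScript)
    $settings = New-Object System.Xml.Xsl.XsltSettings($true, $false)
    $resolver = New-Object System.Xml.XmlUrlResolver
    $xslt = New-Object System.Xml.Xsl.XslCompiledTransform
    $xslt.Load($Stylesheet, $settings, $resolver)
    $xslt.Transform($InputXml, $OutFile)
}
```

Since the stylesheet is just a string on disk, the script can regenerate it (for instance, one document() call per input file) before calling the function.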

