I have this XML file:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ns3:BOX xmlns="urn:loc.gov:item" 
         xmlns:ns2="urn:loc.gov:box" 
         xmlns:ns3="http://www.example.com/inverter" 
         xmlns:ns4="urn:loc.gov:xyz">
    <ns3:Item>
        <Description>ITEM1</Description>
        <PackSizeNumeric>6</PackSizeNumeric>
        <ns2:BuyersItemIdentification>
            <ID>75847589</ID>
        </ns2:BuyersItemIdentification>
        <ns2:CommodityClassification>
            <CommodityCode>856952</CommodityCode>
        </ns2:CommodityClassification>
        <ns2:AdditionalItemProperty>
            <Name>Weight</Name>
            <Value>0</Value>
        </ns2:AdditionalItemProperty>
        <ns2:AdditionalItemProperty>
            <Name>Tare</Name>
            <Value>0</Value>
        </ns2:AdditionalItemProperty>
        <ns2:ManufacturerParty>
            <ns2:PartyIdentification>
                <ID>847532</ID>
            </ns2:PartyIdentification>
        </ns2:ManufacturerParty>
    </ns3:Item>
    <ns3:Item>
        <Description>ITEM2</Description>
        <PackSizeNumeric>10</PackSizeNumeric>
        <ns2:BuyersItemIdentification>
            <ID>9568475</ID>
        </ns2:BuyersItemIdentification>
        <ns2:CommodityClassification>
            <CommodityCode>348454</CommodityCode>
        </ns2:CommodityClassification>
        <ns2:AdditionalItemProperty>
            <Name>Weight</Name>
            <Value>0</Value>
        </ns2:AdditionalItemProperty>
        <ns2:AdditionalItemProperty>
            <Name>Tare</Name>
            <Value>0</Value>
        </ns2:AdditionalItemProperty>
        <ns2:ManufacturerParty>
            <ns2:PartyIdentification>
                <ID>7542125</ID>
            </ns2:PartyIdentification>
        </ns2:ManufacturerParty>
    </ns3:Item>
</ns3:BOX>

I'm trying to convert it to a CSV file.

I get the content:

[xml]$inputFile = Get-Content test.xml

Then I export to CSV:

$inputfile.BOX.childnodes | Export-Csv "Stsadm-EnumSites.csv" -NoTypeInformation -Delimiter:";" -Encoding:UTF8

I get the Description and PackSizeNumeric fields, but not the values of the other fields, which are nested inside elements from other namespaces:

"Description";"PackSizeNumeric";"BuyersItemIdentification";"CommodityClassification";"AdditionalItemProperty";"ManufacturerParty"
"ITEM1";"6";"System.Xml.XmlElement";"System.Xml.XmlElement";"System.Object[]";"System.Xml.XmlElement"
"ITEM2";"10";"System.Xml.XmlElement";"System.Xml.XmlElement";"System.Object[]";"System.Xml.XmlElement"

What is the best way to obtain the fields that are contained in other namespaces?

I would like to get this

"Description";"PackSizeNumeric";"BuyersItemIdentification";"CommodityClassification";"Weight";"Tare";"PartyIdentification"
"ITEM1";"6";"75847589";"856952";"0";"0";"847532"
"ITEM2";"10";"9568475";"348454";"0";"0";"7542125"
  • Show desired results. Commented Jun 1, 2015 at 14:15
  • I corrected my question Commented Jun 1, 2015 at 14:43
  • Your problem is not related to XML namespaces. Export-Csv has trouble converting complex objects to text. It is just a coincidence that all the complex elements in your XML have a namespace prefix. Commented Jun 1, 2015 at 14:59
  • Yes, you need to iterate. However, it does seem that you will need a namespace manager to ease the iteration. Commented Jun 1, 2015 at 15:04
  • @PetSerAl Can you help me to find the right direction? Can I try something else? Commented Jun 1, 2015 at 15:05
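
PetSerAl's point can be reproduced without any namespaces. The following is a minimal sketch with made-up sample data (the element names Box, Item and Identification are hypothetical, not taken from the question): PowerShell exposes a text-only element as a string, but an element with children as an XmlElement object, and the CSV cmdlets simply call ToString() on it.

```powershell
# Hypothetical sample document, deliberately without namespaces
[xml]$doc = @'
<Box>
  <Item>
    <Description>ITEM1</Description>
    <Identification><ID>75847589</ID></Identification>
  </Item>
</Box>
'@

$item = $doc.Box.FirstChild

# Text-only element: the XML adapter returns a plain string
$item.Description.GetType().FullName       # System.String

# Element with child elements: the adapter returns the node object itself
$item.Identification.GetType().FullName    # System.Xml.XmlElement

# ConvertTo-Csv performs the same conversion as Export-Csv: ToString() per value
$csv = $item | ConvertTo-Csv -NoTypeInformation -Delimiter ';'
$csv
# "Description";"Identification"
# "ITEM1";"System.Xml.XmlElement"
```

So each nested value has to be pulled out explicitly (for example with a calculated property) before the object reaches Export-Csv.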

2 Answers

A combination of Select-Object and Select-Xml seems to work pretty well:

$ns = @{
    item="urn:loc.gov:item"
    ns2="urn:loc.gov:box"
    ns3="http://www.example.com/inverter"
    ns4="urn:loc.gov:xyz"
}

$doc = New-Object xml
$doc.Load("test.xml")

$doc.BOX.ChildNodes | Select-Object -Property `
    Description,`
    PackSizeNumeric, `
    @{Name="BuyersItemIdentification_ID"; Expression={$_.BuyersItemIdentification.ID}}, `
    @{Name="CommodityClassification_CommodityCode"; Expression={$_.CommodityClassification.CommodityCode}}, `
    @{Name="Weight"; Expression={Select-Xml -Namespace $ns -Xml $_ -XPath "./ns2:AdditionalItemProperty[item:Name = 'Weight']/item:Value"}}, `
    @{Name="Tare"; Expression={Select-Xml -Namespace $ns -Xml $_ -XPath "./ns2:AdditionalItemProperty[item:Name = 'Tare']/item:Value"}}, `
    @{Name="ManufacturerParty_ID"; Expression={$_.ManufacturerParty.PartyIdentification.ID}} `
| Export-Csv "Stsadm-EnumSites.csv" -NoTypeInformation -Delimiter:";" -Encoding:UTF8

Result (Stsadm-EnumSites.csv):

"Description";"PackSizeNumeric";"BuyersItemIdentification_ID";"CommodityClassification_CommodityCode";"Weight";"Tare";"ManufacturerParty_ID"
"ITEM1";"6";"75847589";"856952";"0";"0";"847532"
"ITEM2";"10";"9568475";"348454";"0";"0";"7542125"

2 Comments

Cool beans! Much better than what I was trying to do by iterating through the child nodes.
Thanks @Vincent :) It still feels a bit cumbersome, but at least it's very explicit in terms of column naming and node selection.

Tomalak's answer is succinct and seems to be the best solution for the problem at hand.

I was trying to make something generic, but the result is not even in the requested format (the list of additional properties is hard to convert in a generic way, and the field names are clunky). Anyway, the solution below walks down the XML tree and flattens the data. It is not tied to the element names (except for the initial select).

After finishing my generic answer, I'm now wondering if it wouldn't be better to write & apply an XSLT transformation.

#[xml]$xml = Get-Content test.xml
#xml to process
$xml = [xml]@"
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ns3:BOX xmlns="urn:loc.gov:item" 
         xmlns:ns2="urn:loc.gov:box" 
         xmlns:ns3="http://www.example.com/inverter" 
         xmlns:ns4="urn:loc.gov:xyz">
    <ns3:Item>
        <Description>ITEM1</Description>
        <PackSizeNumeric>6</PackSizeNumeric>
        <ns2:BuyersItemIdentification>
            <ID>75847589</ID>
        </ns2:BuyersItemIdentification>
        <ns2:CommodityClassification>
            <CommodityCode>856952</CommodityCode>
        </ns2:CommodityClassification>
        <ns2:AdditionalItemProperty>
            <Name>Weight</Name>
            <Value>0</Value>
        </ns2:AdditionalItemProperty>
        <ns2:AdditionalItemProperty>
            <Name>Tare</Name>
            <Value>0</Value>
        </ns2:AdditionalItemProperty>
        <ns2:ManufacturerParty>
            <ns2:PartyIdentification>
                <ID>847532</ID>
            </ns2:PartyIdentification>
        </ns2:ManufacturerParty>
    </ns3:Item>
    <ns3:Item>
        <Description>ITEM2</Description>
        <PackSizeNumeric>10</PackSizeNumeric>
        <ns2:BuyersItemIdentification>
            <ID>9568475</ID>
        </ns2:BuyersItemIdentification>
        <ns2:CommodityClassification>
            <CommodityCode>348454</CommodityCode>
        </ns2:CommodityClassification>
        <ns2:AdditionalItemProperty>
            <Name>Weight</Name>
            <Value>0</Value>
        </ns2:AdditionalItemProperty>
        <ns2:AdditionalItemProperty>
            <Name>Tare</Name>
            <Value>0</Value>
        </ns2:AdditionalItemProperty>
        <ns2:ManufacturerParty>
            <ns2:PartyIdentification>
                <ID>7542125</ID>
            </ns2:PartyIdentification>
        </ns2:ManufacturerParty>
    </ns3:Item>
</ns3:BOX>
"@

$nsm = [Xml.XmlNamespaceManager]$xml.NameTable

$nsm.AddNamespace("ns1","urn:loc.gov:item")
$nsm.AddNamespace("ns2","urn:loc.gov:box")
$nsm.AddNamespace("ns3","http://www.example.com/inverter")
$nsm.AddNamespace("ns4","urn:loc.gov:xyz")

#function to recursively flatten xml subtree into a hashtable (passed in)
function flatten-xml {
  param (
    $Parent,
    $Element,
    $Fieldname,
    $HashTable
  )

  if ($parent -eq "") {
    $label = $fieldname
  } else {
    $label = $parent + "_" + $fieldname 
  }

  #write-host "$label is $($element.GetType())"

  if ($element.GetType() -eq [System.Xml.XmlElement]) { 
    #get property fields

    $element | Get-Member | ? { $_.MemberType -eq "Property" } | % {
      #write-host "moving from $label to $($_.Name)"
      flatten-xml -Parent $label -Element $element.($_.Name) -FieldName $_.Name -HashTable $HashTable
    }
  }elseif($element.GetType() -eq [System.Object[]]) { 
    #write-host "$label is an array"
    $i = 0
    $element | % { flatten-xml -Parent $label -Element $_ -FieldName "item$i" -HashTable $HashTable; $i++ }
  }else {
    $HashTable[$label] = $element
  }
}

#convert the nodecollection returned by xpath query into hashtables and write them out to CSV
$xml.SelectNodes("//ns3:BOX/ns3:Item",$nsm) | % { 
    $element = $_
    $ht = @{}
    $element | Get-Member | ? { $_.MemberType -eq "Property" } | % {
      flatten-xml -Parent "" -Element $element.($_.Name) -FieldName $_.Name -HashTable $ht 
    }

    [PSCustomObject]$ht
}  | Export-Csv "test2.csv" -NoTypeInformation -Delimiter:";" -Encoding:UTF8

Result:

> gc .\test2.csv

"AdditionalItemProperty_item0_Name";"AdditionalItemProperty_item0_Value";"AdditionalItemProperty_item1_Name";"AdditionalItemProperty_item1_Value";"BuyersItemIdentification_ID";"CommodityClassification_CommodityCode";"Description";"ManufacturerParty_PartyIdentification_ID";"PackSizeNumeric"
"Weight"                            ;"0"                                  ;"Tare"                              ;"0"                                  ;"75847589"                   ;"856952"                               ;"ITEM1"      ;"847532"                                  ;"6"
"Weight"                            ;"0"                                  ;"Tare"                              ;"0"                                  ;"9568475"                    ;"348454"                               ;"ITEM2"      ;"7542125"                                 ;"10"
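
The XSLT idea mentioned above could look roughly like the following sketch. It is only an illustration under some assumptions: it uses the .NET XslCompiledTransform class, the prefixes i, b and inv and all variable names are my own choices, the column list is hard-coded, and to stay short it embeds only the first item of the sample XML.

```powershell
# Shortened sample: only the first <ns3:Item> from the question
[xml]$xml = @'
<ns3:BOX xmlns="urn:loc.gov:item"
         xmlns:ns2="urn:loc.gov:box"
         xmlns:ns3="http://www.example.com/inverter">
    <ns3:Item>
        <Description>ITEM1</Description>
        <PackSizeNumeric>6</PackSizeNumeric>
        <ns2:BuyersItemIdentification><ID>75847589</ID></ns2:BuyersItemIdentification>
        <ns2:CommodityClassification><CommodityCode>856952</CommodityCode></ns2:CommodityClassification>
        <ns2:AdditionalItemProperty><Name>Weight</Name><Value>0</Value></ns2:AdditionalItemProperty>
        <ns2:AdditionalItemProperty><Name>Tare</Name><Value>0</Value></ns2:AdditionalItemProperty>
        <ns2:ManufacturerParty>
            <ns2:PartyIdentification><ID>847532</ID></ns2:PartyIdentification>
        </ns2:ManufacturerParty>
    </ns3:Item>
</ns3:BOX>
'@

# XSLT 1.0 stylesheet; the prefixes are arbitrary, only the namespace URIs must match
$xslt = @'
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:i="urn:loc.gov:item"
    xmlns:b="urn:loc.gov:box"
    xmlns:inv="http://www.example.com/inverter">
  <xsl:output method="text"/>
  <xsl:variable name="sep">";"</xsl:variable>
  <xsl:template match="/">
    <xsl:text>"Description";"PackSizeNumeric";"BuyersItemIdentification";"CommodityClassification";"Weight";"Tare";"PartyIdentification"&#10;</xsl:text>
    <xsl:for-each select="inv:BOX/inv:Item">
      <xsl:text>"</xsl:text>
      <xsl:value-of select="i:Description"/>
      <xsl:value-of select="$sep"/>
      <xsl:value-of select="i:PackSizeNumeric"/>
      <xsl:value-of select="$sep"/>
      <xsl:value-of select="b:BuyersItemIdentification/i:ID"/>
      <xsl:value-of select="$sep"/>
      <xsl:value-of select="b:CommodityClassification/i:CommodityCode"/>
      <xsl:value-of select="$sep"/>
      <xsl:value-of select="b:AdditionalItemProperty[i:Name='Weight']/i:Value"/>
      <xsl:value-of select="$sep"/>
      <xsl:value-of select="b:AdditionalItemProperty[i:Name='Tare']/i:Value"/>
      <xsl:value-of select="$sep"/>
      <xsl:value-of select="b:ManufacturerParty/b:PartyIdentification/i:ID"/>
      <xsl:text>"&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
'@

# Compile the stylesheet from the string
$transform = New-Object System.Xml.Xsl.XslCompiledTransform
$transform.Load([System.Xml.XmlReader]::Create((New-Object System.IO.StringReader $xslt)))

# Run the transformation into an in-memory text writer
$xsltArgs = New-Object System.Xml.Xsl.XsltArgumentList
$writer   = New-Object System.IO.StringWriter
$transform.Transform($xml, $xsltArgs, $writer)

$csvText = $writer.ToString()
$csvText
```

This is not generic either: the column list lives in the stylesheet instead of in Select-Object. The values are also written as-is, so a value containing a double quote would need extra escaping.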

2 Comments

Your solution is more complex, but it is interesting for my XML study. There are two errors :-) 1) The order of the fields: Description, PackSizeNumeric, etc. 2) Weight and Tare should be column names :-)
It's OK. Regarding your comment: 1. Order: it seems the hashtable changed the order of the keys (element names); I haven't tried to find a way to preserve the order. 2. Weight and Tare are two elements in an array; they are not columns by themselves in the XML structure you gave. This is what I highlighted in my description: I feel you can't transpose this list of values into columns in a generic way.
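
Both points from these comments can be handled if a couple of assumptions about the data are accepted. The sketch below is not the answerer's code. It assumes that every AdditionalItemProperty holds exactly one Name/Value pair, and that every other child of an item boils down to a single text value. An [ordered] dictionary keeps the columns in document order, and the Name of each pair is used as the column name.

```powershell
# First item of the sample XML from the question
[xml]$xml = @'
<ns3:BOX xmlns="urn:loc.gov:item"
         xmlns:ns2="urn:loc.gov:box"
         xmlns:ns3="http://www.example.com/inverter">
    <ns3:Item>
        <Description>ITEM1</Description>
        <PackSizeNumeric>6</PackSizeNumeric>
        <ns2:BuyersItemIdentification><ID>75847589</ID></ns2:BuyersItemIdentification>
        <ns2:CommodityClassification><CommodityCode>856952</CommodityCode></ns2:CommodityClassification>
        <ns2:AdditionalItemProperty><Name>Weight</Name><Value>0</Value></ns2:AdditionalItemProperty>
        <ns2:AdditionalItemProperty><Name>Tare</Name><Value>0</Value></ns2:AdditionalItemProperty>
        <ns2:ManufacturerParty>
            <ns2:PartyIdentification><ID>847532</ID></ns2:PartyIdentification>
        </ns2:ManufacturerParty>
    </ns3:Item>
</ns3:BOX>
'@

# local-name() tests avoid the need for a namespace manager in this sketch
$rows = foreach ($item in $xml.DocumentElement.SelectNodes("*[local-name()='Item']")) {
    $row = [ordered]@{}    # ordered dictionary: keys stay in insertion (document) order
    foreach ($child in $item.SelectNodes('*')) {
        if ($child.LocalName -eq 'AdditionalItemProperty') {
            # Pivot the Name/Value pair: the Name text becomes the column name
            $name  = $child.SelectSingleNode("*[local-name()='Name']").InnerText
            $value = $child.SelectSingleNode("*[local-name()='Value']").InnerText
            $row[$name] = $value
        } else {
            # Assumption: the element contains a single text value somewhere below it
            $row[$child.LocalName] = $child.InnerText
        }
    }
    [pscustomobject]$row
}

$csv = $rows | ConvertTo-Csv -NoTypeInformation -Delimiter ';'
$csv
```

With the sample item this yields the columns Description, PackSizeNumeric, BuyersItemIdentification, CommodityClassification, Weight, Tare and ManufacturerParty, in that order. If items carry different sets of properties, the CSV cmdlets take the columns from the first object only, so the rows would have to be normalized first.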
