My goal is to recursively create an XML file based on another XML file.
Sample input data is below (the original files are about 10 MB):

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<Picture Name="Template">
  <Children>
    <Shape Name="GroupLevel" ClassName="Group">
      <Properties>
        <Property Name="Property1" Value="Value1" />
        <Property Name="Property2" Value="Value2" />
        <Property Name="Property3" Value="Value3" />
      </Properties>
      <ContainedObjects>
        <Shape Name="Group1" ClassName="Group">
          <Properties>
            <Property Name="Property1" Value="Value1" />
            <Property Name="Property2" Value="Value2" />
            <Property Name="Property3" Value="Value3" />
          </Properties>
          <ContainedObjects>
            <Shape Name="Glue123" ClassName="Text" />
            <Shape Name="Variable1" ClassName="Variable" />
            <Shape Name="Variable2" ClassName="Variable" />
            <Shape Name="Variable3" ClassName="Variable" />
            <Shape Name="Group2" ClassName="Group">
              <Properties>
                <Property Name="Property1" Value="Value1" />
                <Property Name="Property2" Value="Value2" />
                <Property Name="Property3" Value="Value3" />
              </Properties>
              <ContainedObjects>
                <Shape Name="Group3" ClassName="RoundRect" />
              </ContainedObjects>
            </Shape>
          </ContainedObjects>
        </Shape>
      </ContainedObjects>
    </Shape>
  </Children>
</Picture>

  1. Loop inside the Children tag. For each Shape tag whose parent is Children, create a node with that tag name under a blank root.
  2. Ignore Properties / Property nodes.
  3. For each Shape tag whose parent is ContainedObjects, create a ContainedObjects node at the same depth and add the Shape tag inside it.
  4. The actual operations involve creating a different set of nodes for each Shape tag, which will be done at a later stage.
     For now, I am stuck at the recursive looping.

Expected output:

<Children>
  <Shape Name="GroupLevel" ClassName="Group">
    <ContainedObjects>
      <Shape Name="Group1" ClassName="Group">
        <ContainedObjects>
          <Shape Name="Glue123" ClassName="Text" />
          <Shape Name="Variable1" ClassName="Variable" />
          <Shape Name="Variable2" ClassName="Variable" />
          <Shape Name="Variable3" ClassName="Variable" />
          <Shape Name="Group2" ClassName="Group">
            <ContainedObjects>
              <Shape Name="Group3" ClassName="RoundRect" />
            </ContainedObjects>
          </Shape>
        </ContainedObjects>
      </Shape>
    </ContainedObjects>
  </Shape>
</Children>

Problem code: this doesn't work. I tried looping recursively, but the output doesn't keep the depth/nesting of the objects. Please share any pointers.

def RecurseXML(Data_Tree):
    for Children in Data_Tree.iterchildren():
        if Children.tag == "Shape":
            Owner       = Children.getparent().tag
            Owner       = etree.SubElement(Root_Child,Owner)
            SomeValue   = etree.SubElement(Owner,Children.tag,attrib={"Name":Children.attrib['Name']})
            Root_Child.append(SomeValue)
        if len(Children) > 0:
            RecurseXML(Children)
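
In the code above, every new node is created under the global Root_Child, so the output stays flat regardless of how deep the recursion goes. A minimal sketch of one way to keep the nesting is to pass the current output parent down the recursion. This sketch uses the standard library's xml.etree.ElementTree rather than lxml so it runs without extra installs; the names KEEP, copy_structure and SAMPLE are made up for illustration, and skipping every non-listed tag entirely is an assumption based on the rules listed above.

```python
import xml.etree.ElementTree as ET

# Tags to carry over; everything else (Properties, Property) is skipped.
KEEP = {"Children", "Shape", "ContainedObjects"}

def copy_structure(source, target_parent):
    """Copy kept children of `source` under `target_parent`, preserving depth."""
    for child in source:
        if child.tag not in KEEP:
            continue  # ignore Properties / Property subtrees
        # Create the copy under the *current* output parent, then recurse
        # with that copy as the new parent so the nesting is preserved.
        new_parent = ET.SubElement(target_parent, child.tag, dict(child.attrib))
        copy_structure(child, new_parent)

# Shortened version of the sample input (assumption: same layout as the real file).
SAMPLE = """<Picture Name="Template">
  <Children>
    <Shape Name="GroupLevel" ClassName="Group">
      <Properties><Property Name="Property1" Value="Value1" /></Properties>
      <ContainedObjects>
        <Shape Name="Group1" ClassName="Group">
          <ContainedObjects>
            <Shape Name="Glue123" ClassName="Text" />
          </ContainedObjects>
        </Shape>
      </ContainedObjects>
    </Shape>
  </Children>
</Picture>"""

source = ET.fromstring(SAMPLE)
result = ET.Element("Children")
copy_structure(source.find("Children"), result)
print(ET.tostring(result, encoding="unicode"))
```

The key difference from the problem code is that the parent of each new node is an argument of the recursive call instead of a fixed global element.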

1 Answer

XSLT is good at that:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>
  
  <xsl:template match="*">
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="Children | Shape | ContainedObjects">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

lxml can run XSLT 1.0 stylesheets through its etree.XSLT class.
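
As a rough sketch of how the stylesheet might be applied from Python, assuming lxml is installed: the stylesheet is the one shown above, while the shortened SOURCE document and the variable names are only for illustration.

```python
from lxml import etree

STYLESHEET = b"""<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="*">
    <xsl:apply-templates/>
  </xsl:template>
  <xsl:template match="Children | Shape | ContainedObjects">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>"""

# Shortened stand-in for the real input file (assumption: same layout).
SOURCE = b"""<Picture Name="Template">
  <Children>
    <Shape Name="GroupLevel" ClassName="Group">
      <Properties><Property Name="Property1" Value="Value1" /></Properties>
      <ContainedObjects>
        <Shape Name="Group1" ClassName="Group" />
      </ContainedObjects>
    </Shape>
  </Children>
</Picture>"""

# Compile the stylesheet once; the resulting object is callable on documents.
transform = etree.XSLT(etree.fromstring(STYLESHEET))
result = transform(etree.fromstring(SOURCE))

# str() serializes the result according to the stylesheet's xsl:output settings.
print(str(result))
```

For the real files, etree.parse() with a file path could replace etree.fromstring() on both the stylesheet and the input.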

2 Comments

Hey Martin, could you please point me towards a good source for learning to build XSLT templates? This might resolve a lot of the transformations I need to work on!
There are lots of books and courses. cranesoftwrights.github.io/books/ptux/index.htm is available for free; the author also has a video course on Udemy, although there you need to check which parts or hours are free. stackoverflow.com/tags/xslt/info has a larger list of resources for learning, coding and practising.
