I have a number of XML-files with a structure like this:

<titles>
  <title mode="example" name="name_example">
    <titleselect>
      <attribute_a>attrib_a</attribute_a>
      <attribute_b>attrib_b</attribute_b>
      <attribute_c>attrib_c</attribute_c>
      <sort_attribute>New York</sort_attribute>
    </titleselect>
  </title>
  <title mode="another_example" name="another_name">
    <titleselect>
      <attribute_a>attrib_a</attribute_a>
      <attribute_b>attrib_b</attribute_b>
      <attribute_c>attrib_c</attribute_c>
      <sort_attribute>Boston</sort_attribute>
    </titleselect>
  </title>
  <title mode="final_example" name="final_name">
    <titleselect>
      <attribute_a>attrib_a</attribute_a>
      <attribute_b>attrib_b</attribute_b>
      <attribute_c>attrib_c</attribute_c>
      <sort_attribute>Chicago</sort_attribute>
    </titleselect>
  </title>
</titles>

I am trying to sort the "titles" alphabetically by the "sort_attribute". My desired output is like this:

<titles>
  <title mode="another_example" name="another_name">
    <titleselect>
      <attribute_a>attrib_a</attribute_a>
      <attribute_b>attrib_b</attribute_b>
      <attribute_c>attrib_c</attribute_c>
      <sort_attribute>Boston</sort_attribute>
    </titleselect>
  </title>
  <title mode="final_example" name="final_name">
    <titleselect>
      <attribute_a>attrib_a</attribute_a>
      <attribute_b>attrib_b</attribute_b>
      <attribute_c>attrib_c</attribute_c>
      <sort_attribute>Chicago</sort_attribute>
    </titleselect>
  </title>
  <title mode="example" name="name_example">
    <titleselect>
      <attribute_a>attrib_a</attribute_a>
      <attribute_b>attrib_b</attribute_b>
      <attribute_c>attrib_c</attribute_c>
      <sort_attribute>New York</sort_attribute>
    </titleselect>
  </title>
</titles>

Is there any way to achieve this, preferably using XSLT or Python? I am completely new to XSLT. I have tried applying a number of solutions from other relevant questions, e.g. "XSLT sort parent element based on child element attribute", to no avail.

Yes, this is easy. The answer you link to is already excellent; what was your attempt at applying it? Commented Nov 14, 2017 at 9:05

2 Answers


If you are still interested in a Python solution, this can be done with the standard library's xml.etree.ElementTree module.

How it works:

  1. Getting all the title nodes
  2. Removing each one from the root node
  3. Sorting the title nodes in memory based on the sort_attribute tag
  4. Adding each title node back to the root element in the correct order


import xml.etree.ElementTree as ET


def get_sort_attribute_tag_value(node):
    # Text of the <sort_attribute> element nested in this <title>
    return node.find('titleselect').find('sort_attribute').text


# Let the parser read the file so it honours the declared encoding
tree = ET.parse('test.xml')
xml_node = tree.getroot()

# 1. Get all the title nodes
title_nodes = xml_node.findall('title')

# 2. Remove each one from the root node
for title_node in title_nodes:
    xml_node.remove(title_node)

# 3. Sort the title nodes in memory
title_nodes.sort(key=get_sort_attribute_tag_value)

# 4. Add each title node back in sorted order
for title_node in title_nodes:
    xml_node.append(title_node)

print(ET.tostring(xml_node, encoding='unicode'))

# Save the sorted tree as a new file
tree.write('new_file.xml')

Outputs:

<titles>
    <title mode="another_example" name="another_name">
        <titleselect>
            <attribute_a>attrib_a</attribute_a>
            <attribute_b>attrib_b</attribute_b>
            <attribute_c>attrib_c</attribute_c>
            <sort_attribute>Boston</sort_attribute>
        </titleselect>
    </title>
    <title mode="final_example" name="final_name">
        <titleselect>
            <attribute_a>attrib_a</attribute_a>
            <attribute_b>attrib_b</attribute_b>
            <attribute_c>attrib_c</attribute_c>
            <sort_attribute>Chicago</sort_attribute>
        </titleselect>
    </title>
    <title mode="example" name="name_example">
        <titleselect>
            <attribute_a>attrib_a</attribute_a>
            <attribute_b>attrib_b</attribute_b>
            <attribute_c>attrib_c</attribute_c>
            <sort_attribute>New York</sort_attribute>
        </titleselect>
    </title>
</titles>
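
Since the question mentions a number of XML files, the same steps can be wrapped in small helpers. The following is only a sketch: the names sort_titles and sort_file are hypothetical, and it assumes every child of the root is a title element (the slice assignment replaces all children, so anything else would be dropped). A title without a sort_attribute is given an empty key and therefore sorts first.

```python
import xml.etree.ElementTree as ET


def sort_titles(root):
    """Reorder the <title> children of root by their sort_attribute text."""
    def sort_key(title):
        # findtext takes a path; a missing element falls back to ''
        return title.findtext('titleselect/sort_attribute', default='')

    # Slice assignment swaps in the sorted children in one step,
    # replacing the remove/append loops (assumes only <title> children).
    root[:] = sorted(root.findall('title'), key=sort_key)
    return root


def sort_file(src, dst):
    """Read src, sort its titles, and write the result to dst."""
    tree = ET.parse(src)
    sort_titles(tree.getroot())
    tree.write(dst, encoding='utf-8', xml_declaration=True)
```

Calling sort_file once per input path (for example, for each path returned by pathlib.Path.glob('*.xml')) would cover a whole directory. On Python 3.9 or later, ET.indent(root) can be called before writing to normalise the indentation after the reorder.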

1 Comment

Is there any way to write the result to a new XML file?

As an XSLT alternative, as per Tomalek's comment, this is fairly straightforward. The identity template copies every node unchanged, and a second template matching the parent titles element copies it and applies templates to its title children, sorted by the required sort_attribute (actually an element, not an attribute):

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <!-- identity transform -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="titles">
    <xsl:copy>
      <xsl:apply-templates select="title">
        <xsl:sort select="titleselect/sort_attribute" data-type="text" order="ascending"/>
      </xsl:apply-templates>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>

1 Comment

Thank you very much, this solved my problem completely! I think the match attribute was what prevented my previous attempts from succeeding.
