I am reading an XML file, taking part of it, and writing that part to a YAML file. For example, given this XML file:

<project>


  <scm class="hudson.scm.NullSCM"/>
  <assignedNode>python</assignedNode>
  <canRoam>false</canRoam>
  <disabled>false</disabled>
  <blockBuildWhenDownstreamBuilding>false</blockBuildWhenDownstreamBuilding>
  <blockBuildWhenUpstreamBuilding>false</blockBuildWhenUpstreamBuilding>
  <triggers>
    <hudson.triggers.TimerTrigger>
      <spec>H * * * *</spec>
    </hudson.triggers.TimerTrigger>
  </triggers>
  <concurrentBuild>false</concurrentBuild>
  <builders>

I want to read only the disabled value and the spec value and write them to a YAML file like this (expected output):

disabled: 'false'
name: Cancellation_CMT_Tickets
triggers:
  hudson.triggers.TimerTrigger:
    spec: H * * * *

I can only dump the YAML in the above format when my resulting dictionary looks like this:

d = {"triggers": {"hudson.triggers.TimerTrigger": {"spec": "H * * * *"}}}

My current code looks like this; the search keys are passed as command-line arguments:

import sys
import xml.etree.ElementTree as ET

import yaml

tree = ET.parse('test.xml')
root = tree.getroot()

d = {}

def xmpparse(root, searchkey):
    # Walk the tree; store the text of every element whose tag matches
    for child in root:
        if child.tag == searchkey:
            d[child.tag] = child.text
        elif len(child):
            xmpparse(child, searchkey)

# Skip sys.argv[0], which is the script name rather than a search key
for i in sys.argv[1:]:
    xmpparse(root, i)

print(yaml.dump(d, default_flow_style=False))

Current output:

disabled: 'false'
spec: H * * * *

Any help would be much appreciated. Thanks in advance!

  • Don't know much (or anything...) about YAML, but I can get your dictionary to its proper format (using the lxml library), if that helps. Commented Jul 20, 2019 at 19:11
  • I want a nested dictionary containing only the entries I need to grab from the XML. Can you help me with that? During the recursion, it should start putting the values into a nested dictionary. Commented Jul 20, 2019 at 20:39
  • Let me know if the answer below works. Commented Jul 20, 2019 at 21:12

1 Answer

I believe this should take care of the nested dictionary problem, at least; it's based on various answers on SO on how to form nested dictionaries (and there may be other methods):

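    import lxml.html as LH

    class NestedDict(dict):
        # Create a nested NestedDict on first access to a missing key
        def __missing__(self, key):
            self[key] = NestedDict()
            return self[key]

    data = [your xml above]

    doc = LH.fromstring(data)

    # Create the dictionary once, so it exists even when nothing matches
    d = NestedDict()
    for i in doc:
        if i.tag == 'triggers':
            for child in i:
                # Note: lxml.html lowercases tag names
                d[i.tag][child.tag][child[0].tag] = child[0].text_content().strip()

    print(d)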

Output:

{'triggers': {'hudson.triggers.timertrigger': {'spec': 'H * * * *'}}}
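
The three chained keys are not a limit of NestedDict itself. As a minimal sketch of the same idea for a key chain of any length, the snippet below walks a list of keys and lets `__missing__` create each intermediate level. The helper name `set_path` is hypothetical and not part of any library.

```python
class NestedDict(dict):
    """A dict that creates a nested NestedDict for any missing key."""

    def __missing__(self, key):
        self[key] = NestedDict()
        return self[key]


def set_path(target, keys, value):
    """Store value under the chain of keys (hypothetical helper)."""
    node = target
    for key in keys[:-1]:
        # Accessing a missing key triggers __missing__, which adds the level
        node = node[key]
    node[keys[-1]] = value


d = NestedDict()
set_path(d, ["triggers", "hudson.triggers.TimerTrigger", "spec"], "H * * * *")
set_path(d, ["disabled"], "false")
print(d)
```

Because the key chain is just a list, it can be built from however many ancestor tags an element has.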

1 Comment

I forgot to mention that my code should work at any depth for the search keys passed as runtime arguments. It looks like your solution only works for this particular XML file, though I haven't tried it yet. With d[i.tag][child.tag][child[0].tag] you are creating an entry with exactly 3 keys.
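
One possible way to meet the any-depth requirement, sketched with only the standard library: carry the tuple of ancestor tags down the recursion and, on a match, rebuild that path as nested dictionaries. The function name `collect` and the trimmed XML string are assumptions for illustration. The sketch also assumes that the root tag is left out of the result and that a later duplicate tag on the same path overwrites an earlier one.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the XML from the question (illustrative input)
XML = """<project>
  <disabled>false</disabled>
  <triggers>
    <hudson.triggers.TimerTrigger>
      <spec>H * * * *</spec>
    </hudson.triggers.TimerTrigger>
  </triggers>
</project>"""


def collect(element, search_keys, path=(), result=None):
    """Copy every element whose tag is in search_keys into result,
    nested under the tags of its ancestors (root excluded)."""
    if result is None:
        result = {}
    for child in element:
        child_path = path + (child.tag,)
        if child.tag in search_keys:
            node = result
            # Create or reuse one dictionary level per ancestor tag
            for key in child_path[:-1]:
                node = node.setdefault(key, {})
            node[child.tag] = child.text
        elif len(child):
            # Not a match, but it has children: search deeper
            collect(child, search_keys, child_path, result)
    return result


root = ET.fromstring(XML)
d = collect(root, {"disabled", "spec"})
print(d)
```

In the question's script, the search keys would come from `sys.argv[1:]`, and the resulting plain dictionary can be passed to `yaml.dump(d, default_flow_style=False)` as before.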