I have about 1000 XML files, and some of them are missing an element. I wrote a script that searches through the files and prints the element. I want it to also add the element when it isn't there, but so far I have been unsuccessful.

As you can see below, the hardwareRevision element (DVT3) isn't there and needs to be added.

My Code

import os
from lxml import etree

XMLParser = etree.XMLParser(remove_blank_text=True)
for f in os.listdir(directory):
    if f.endswith(".xml"):

        xmlfile = directory + '/' + f


        tree = etree.parse(xmlfile, parser=XMLParser)
        root = tree.getroot()

        hardwareRevisionNode = root.find(".//hardwareRevision")

        try:
            print f + ' :   ' + hardwareRevisionNode.text
        except Exception as e:
            print str(e)
            print xmlfile
            #Wearable = root.find(".//Wearable")
            ChildNode = etree.Element(".//Wearable")
            ChildNode.text = "DVT2"
            ChildNode.append(ChildNode)
            tree.write(xmlfile, pretty_print=True)

XML File

<?xml version='1.0' encoding='UTF-8' standalone='yes' ?>
<Speech>
  <Results>
    <breakpoint name="ASR_START_RECOGNITION" elapsedTime="00:00:00.000" />
  </Results>
  <Meta>
    <Dialog>
      <sessionUUID>7c9b1e3a-b22f-4793-818f-72bc6e7b84a9</sessionUUID>
    </Dialog>
    <ASR>
      <engine>
        <name>Rockhopper</name>
        <version>1.0.0.61</version>
      </engine>
      <wrapper>
        <name>RockhopperAsrEngine</name>
        <version>1.8.2</version>
      </wrapper>
      <wrapper>
        <name>Core</name>
        <version>1.8.2</version>
      </wrapper>
      <resource>
        <name>Language Model</name>
        <version>1.4.4</version>
      </resource>
    </ASR>
    <Application>
      <name>FightClub</name>
      <version>0.1.550</version>
      <commit>8f7a411</commit>
      <buildDate>2016-03-09T18:16Z</buildDate>
      <branch>HEAD</branch>
    </Application>
    <Wearable>
      <firmware>1.0.183 - FCB1APP000-1611W0183</firmware>
    </Wearable>
  </Meta>
</Speech>

XML file I Want

<?xml version='1.0' encoding='UTF-8' standalone='yes' ?>
<Speech>
  <Results>
    <breakpoint name="ASR_START_RECOGNITION" elapsedTime="00:00:00.000" />
  </Results>
  <Meta>
    <Dialog>
      <sessionUUID>7c9b1e3a-b22f-4793-818f-72bc6e7b84a9</sessionUUID>
    </Dialog>
    <ASR>
      <engine>
        <name>Rockhopper</name>
        <version>1.0.0.61</version>
      </engine>
      <wrapper>
        <name>RockhopperAsrEngine</name>
        <version>1.8.2</version>
      </wrapper>
      <wrapper>
        <name>Core</name>
        <version>1.8.2</version>
      </wrapper>
      <resource>
        <name>Language Model</name>
        <version>1.4.4</version>
      </resource>
    </ASR>
    <Application>
      <name>FightClub</name>
      <version>0.1.550</version>
      <commit>8f7a411</commit>
      <buildDate>2016-03-09T18:16Z</buildDate>
      <branch>HEAD</branch>
    </Application>
    <Wearable>
      <firmware>1.0.183 - FCB1APP000-1611W0183</firmware>
      <hardwareRevision>DVT3</hardwareRevision>
    </Wearable>
  </Meta>
</Speech>

1 Answer

You can try out xmltodict.

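import xmltodict

# myfile and outfile are placeholders for the input and output paths
with open(myfile) as f:
    xml_dictionary = xmltodict.parse(f.read(), encoding='utf-8')

# Add the element only when it is missing; an existing value is left alone
xml_dictionary['Speech']['Meta']['Wearable'].setdefault('hardwareRevision', 'DVT3')

# pretty=True indents the output instead of writing it on one line
output = xmltodict.unparse(xml_dictionary, pretty=True)

with open(outfile, 'w') as out:
    out.write(output)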
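
If you would rather not add a dependency, roughly the same change can be sketched with the standard library's xml.etree.ElementTree, which I am swapping in here for the lxml used in the question. The function name add_hardware_revision and its value parameter are made up for illustration, and I am assuming every file has at most one Wearable element, as in the sample. Treat it as a starting point, not a finished tool.

```python
import xml.etree.ElementTree as ET


def add_hardware_revision(path, value="DVT3"):
    """Add <hardwareRevision> under <Wearable> if it is missing.

    Returns True when the file was modified, False otherwise.
    (Hypothetical helper; the name and signature are illustrative.)
    """
    tree = ET.parse(path)
    wearable = tree.getroot().find(".//Wearable")

    # Nothing to do if there is no parent to attach to,
    # or if the element is already present.
    if wearable is None or wearable.find("hardwareRevision") is not None:
        return False

    # SubElement creates the child and appends it to the parent in one step.
    ET.SubElement(wearable, "hardwareRevision").text = value

    tree.write(path, encoding="UTF-8", xml_declaration=True)
    return True
```

The key difference from the code in the question is that the new element is created with a plain tag name and attached to the Wearable node that find() returned, not appended to itself. lxml.etree has find() and SubElement() under the same names, so the same shape should carry over. The new element is written without indentation here; on Python 3.9+ you could call ET.indent(tree) before writing if that matters.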

Run it in parallel if you want. If storage is a concern, simply overwrite the original files (or delete each old file immediately after its replacement has been written).
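
Here is one way the parallel, overwrite-in-place idea could look, using only the standard library. The names replace_in_place and process_directory are hypothetical, and transform stands for whatever function turns the old file text into the new text (for example, a wrapper around the parse/unparse steps). I am assuming the files are UTF-8 and that a thread pool is good enough; threads mostly help with disk I/O, so for heavy parsing a process pool may suit better.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def replace_in_place(path, transform):
    """Rewrite one file with transform(old_text) (hypothetical helper)."""
    with open(path, encoding="utf-8") as src:
        new_text = transform(src.read())

    # Write to a temporary file first, then swap it over the original,
    # so an error halfway through does not leave a truncated file behind.
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as out:
        out.write(new_text)
    os.replace(tmp_path, path)


def process_directory(directory, transform, workers=4):
    """Apply transform to every .xml file in directory, using a thread pool."""
    paths = [
        os.path.join(directory, name)
        for name in sorted(os.listdir(directory))
        if name.endswith(".xml")
    ]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for all tasks and re-raises any exception from a worker.
        list(pool.map(lambda p: replace_in_place(p, transform), paths))
    return paths
```

Because each file is replaced as soon as its new version is written, you never hold a second full copy of the data set on disk.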
