import os
from xml.etree import ElementTree as ET
# Files are in a subfolder of the directory this script is run from
path = "attachments"
for filename in os.listdir(path):
    # Only get xml files
    # (I haven't been able to get it to work with just 'if filename.endswith('.xml')', only with 'if not'.)
    if not filename.endswith('.xml'): continue
    # Join the path and filename so that Python knows the full path to pass to the parser
    fullname = os.path.join(path, filename)
    # Parse the file
    tree = ET.parse(fullname)
    print(tree)
    # Get the root of the XML tree structure
    root = tree.getroot()
    # Print the tags it finds from all the child elements from root
for child in root:
    print(child.tag, child.text)

I have the REST message built. I have figured out how to parse the XML for a single file.

What I cannot figure out is how to parse multiple files in a directory. In other words, I want to iterate through each file, parse the XML, and pass certain elements from the XML into a REST post message.

I've tried everything I could find on the internet, searching and trying for the last two days. Nothing seems to work, or I'm just doing it wrong... :\

In my comments, I explain what I believe is happening. You can see where I'm saying: for each filename in the path I give it, parse the XML file and print the tags and text. As written, it confirms that I do in fact have six objects (that's how many *.xml files are in that directory). Then it prints all the elements and text for only one of them, which is actually the middle file (the 4th file).

Here is the exact output I see, minus some sensitive data.

<xml.etree.ElementTree.ElementTree object at 0x0000018EF60E7608>
<xml.etree.ElementTree.ElementTree object at 0x0000018EF60DFE08>
<xml.etree.ElementTree.ElementTree object at 0x0000018EF62B1B08>
<xml.etree.ElementTree.ElementTree object at 0x0000018EF62E3F48>
<xml.etree.ElementTree.ElementTree object at 0x0000018EF629B608>
<xml.etree.ElementTree.ElementTree object at 0x0000018EF62B6988>

NUMBER 20514218
PARENT

STATUS 1-CLOSED
OPEN_DATE 12/01/2017 00:34:35
CLOSE_DATE 12/05/2017 17:48:28
SOURCE Self Service
PROCESS HR INTERNAL REQUEST FORM
CATEGORY HR Connect
SUB_CATEGORY Personnel Action Change/Update
USER_ID *sensitive information*
LAST_NAME *sensitive information*
FIRST_NAME Brandon
SITUATION SELECT...
PRIORITY 5 Days
ADVISOR_NAME ROMAN *sensitive information*
TEAM *sensitive information*
NEXT_ACTION

PROCESS_STATUS Verified
TRANSFERT_DATE

DEADLINE 12/12/2017 17:18:03
QUEUE HR Internal Request
FROZEN_DATE

OTHER_EMPLOYEE_ID *sensitive information*
REQUEST *sensitive information*

HISTORY_RESPONSE *sensitive information*
FINAL_RESPONSE *sensitive information*

-------------------here's the raw XML----------------------
<?xml version="1.0" encoding="UTF-8"?>
<CASE>
  <NUMBER>20514218</NUMBER>
  <PARENT>
  </PARENT>
  <STATUS>1-CLOSED</STATUS>
  <OPEN_DATE>12/01/2017 00:34:35</OPEN_DATE>
  <CLOSE_DATE>12/05/2017 17:48:28</CLOSE_DATE>
  <SOURCE>Self Service</SOURCE>
  <PROCESS>HR INTERNAL REQUEST FORM</PROCESS>
  <CATEGORY>HR Connect</CATEGORY>
  <SUB_CATEGORY>Personnel Action Change/Update</SUB_CATEGORY>
  <USER_ID>*sensitive information*</USER_ID>
  <LAST_NAME>*sensitive information*</LAST_NAME>
  <FIRST_NAME>*sensitive information*</FIRST_NAME>
  <SITUATION>SELECT...</SITUATION>
  <PRIORITY>5 Days</PRIORITY>
  <ADVISOR_NAME>ROMAN *sensitive information*</ADVISOR_NAME>
  <TEAM>2 HR SRV CNTR PA</TEAM>
  <NEXT_ACTION>
  </NEXT_ACTION>
  <PROCESS_STATUS>Verified</PROCESS_STATUS>
  <TRANSFERT_DATE>
  </TRANSFERT_DATE>
  <DEADLINE>12/12/2017 17:18:03</DEADLINE>
  <QUEUE>HR Internal Request</QUEUE>
  <FROZEN_DATE>
  </FROZEN_DATE>
  <OTHER_EMPLOYEE_ID>*sensitive information*</OTHER_EMPLOYEE_ID>
  <REQUEST>*sensitive information*</REQUEST>
  <HISTORY_RESPONSE>*sensitive information*</HISTORY_RESPONSE>
  <FINAL_RESPONSE>*sensitive information*</FINAL_RESPONSE>
</CASE>
  • Your for loop is probably indented incorrectly. Try indenting it to the same level as your root = tree.getroot() line. Commented Jul 23, 2019 at 23:50
  • That was the issue! Thank you Jack Fleeting! Commented Jul 24, 2019 at 3:23
  • Glad it worked for you! Commented Jul 24, 2019 at 10:45
  • @Jack Fleeting, how do I give you credit for this? I want to make sure you get the answer... I don't see how to do it. Obviously I'm pretty new to Stack. I've spent the majority of my developer career in ServiceNow (JavaScript). Just starting to branch out. Commented Jul 30, 2019 at 21:56
  • Don't worry about it - next time you see a question or answer of mine, just upvote it :) Commented Jul 30, 2019 at 23:31

1 Answer

import os
from xml.etree import ElementTree as ET
# Files are in a subfolder of the directory this script is run from
path = "attachments"
for filename in os.listdir(path):
    # Only get xml files
    if not filename.endswith('.xml'): continue
    # Join the path and filename so that Python knows the full path to pass to the parser
    fullname = os.path.join(path, filename)
    # Parse the file
    tree = ET.parse(fullname)
    print(tree)
    # Get the root of the XML tree structure
    root = tree.getroot()
    # Print the tag and text of every child element of root.
    # This loop must sit inside the outer loop so it runs once per file.
    for child in root:
        print(child.tag, child.text)

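The indentation was wrong. The inner for loop sat outside the outer loop's body, so it ran only once, after all the files had been parsed, using whichever root was assigned last. Indented as above, it runs once per file. My thanks to Jack Fleeting.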
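
For the next step (passing certain elements into the REST message), here is a minimal sketch of how the same loop could collect selected fields per file instead of printing everything. The names collect_cases and WANTED_TAGS are hypothetical, and the tags listed are only an example picked from the XML in the question. It also strips whitespace, since empty elements such as PARENT contain just a newline and spaces.

```python
import os
from xml.etree import ElementTree as ET

# Hypothetical selection of tags; adjust to whatever the REST message needs.
WANTED_TAGS = ("NUMBER", "STATUS", "OPEN_DATE", "CLOSE_DATE")


def collect_cases(path, wanted=WANTED_TAGS):
    """Return one dict per .xml file in `path`, holding only the wanted tags."""
    cases = []
    # sorted() gives a stable order; os.listdir() makes no ordering guarantee.
    for filename in sorted(os.listdir(path)):
        if not filename.endswith(".xml"):
            continue
        root = ET.parse(os.path.join(path, filename)).getroot()
        record = {"file": filename}
        for tag in wanted:
            # findtext() returns None when the tag is missing; strip() removes
            # the whitespace that empty elements such as <PARENT> carry.
            text = root.findtext(tag)
            record[tag] = text.strip() if text else ""
        cases.append(record)
    return cases
```

Calling collect_cases("attachments") should then give a list with one dict per XML file, and each dict can be used to build the body of one POST request.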
