I am trying to parse an XML file named sample.xml, whose contents are shown below.

<?xml version="1.0" encoding="UTF-8"?><gudid xmlns="http://www.fda.gov/cdrh/gudid" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="1.0" xsi:schemaLocation="http://www.fda.gov/cdrh/gudid gudid.xsd">
<device xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.fda.gov/cdrh/gudid">
  <publicDeviceRecordKey>7c36b446-020c-44ab-9ce7-a85387467e0f</publicDeviceRecordKey>
  <publicVersionStatus>New</publicVersionStatus>
  <deviceRecordStatus>Published</deviceRecordStatus>
  <identifiers>
    <identifier>
      <deviceId>M930756120810</deviceId>
      <deviceIdType>Primary</deviceIdType>
      <deviceIdIssuingAgency>HIBCC</deviceIdIssuingAgency>
      <containsDINumber xsi:nil="true"></containsDINumber>
      <pkgQuantity xsi:nil="true"></pkgQuantity>
      <pkgDiscontinueDate xsi:nil="true"></pkgDiscontinueDate>
      <pkgStatus xsi:nil="true"></pkgStatus>
      <pkgType xsi:nil="true"></pkgType>
    </identifier>
  </identifiers>
  <brandName>Life Instruments</brandName>
  <gmdnTerms>
    <gmdn>
      <gmdnPTName>Orthopaedic knife</gmdnPTName>
      <gmdnPTDefinition>A hand-held manual surgical instrument designed for cutting/shaping bone during an orthopaedic surgical intervention. It is typically a heavy, one-piece instrument with a sharp, single-edged, strong cutting blade at the distal end available in various shapes and sizes, with a handle at the proximal end. It is normally made of high-grade stainless steel. This is a reusable device.</gmdnPTDefinition>
    </gmdn>
  </gmdnTerms>
  <productCodes>
    <fdaProductCode>
      <productCode>LXH</productCode>
      <productCodeName>Orthopedic Manual Surgical Instrument</productCodeName>
    </fdaProductCode>
  </productCodes>
  <deviceSizes/>
  <environmentalConditions/>
</device>
</gudid>

I am using the code below to parse this XML file.

from lxml import etree

file = "sample.xml"

root = etree.parse(file).xpath(
    "x:device", namespaces={"x": "http://www.fda.gov/cdrh/gudid"}
)

for event, element in etree.iterwalk(root, events=("start", "end")):
    if event == "start":
        print(event, etree.QName(element).localname, element.text)
    if event == "end":
        element.clear()

However, this line

for event, element in etree.iterwalk(root, events=("start", "end")):

fails with the following error:

TypeError: Invalid input object: list

I can't see what I am doing wrong here.

1 Answer

The .xpath(...) method returns a list of elements, but you seem to be assuming it returns a single element. Check that the list isn't empty and then use the first element out of this list:

devices = etree.parse(file).xpath(
    "x:device", namespaces={"x": "http://www.fda.gov/cdrh/gudid"}
)

if not devices:
    raise ValueError("No devices found")

device = devices[0]
for event, element in etree.iterwalk(device, events=("start", "end")):
    # rest of loop omitted...

Note also that I've changed the name of the variable from root to device, as it's not the root of your XML document: the <gudid> element is.
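
For reference, here is a self-contained sketch of the whole thing that you can run without sample.xml. It is only an illustration under a few assumptions: the SAMPLE bytes are a trimmed-down stand-in for your file (same namespace, fewer elements), walk_first_device is a helper name I made up, and the loop body is copied from your question except that it collects the values in a list as well as printing nothing itself.

```python
from io import BytesIO

from lxml import etree

NS = {"x": "http://www.fda.gov/cdrh/gudid"}

# Assumption: a shortened stand-in for sample.xml with the same namespace.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<gudid xmlns="http://www.fda.gov/cdrh/gudid">
  <device>
    <publicVersionStatus>New</publicVersionStatus>
    <brandName>Life Instruments</brandName>
  </device>
</gudid>"""


def walk_first_device(source):
    """Hypothetical helper: walk the first <device> and collect (name, text)."""
    # .xpath() always returns a list here, even when only one element matches.
    devices = etree.parse(source).xpath("x:device", namespaces=NS)
    if not devices:
        raise ValueError("No devices found")

    device = devices[0]  # a single element, which is what iterwalk expects
    seen = []
    for event, element in etree.iterwalk(device, events=("start", "end")):
        if event == "start":
            seen.append((etree.QName(element).localname, element.text))
        if event == "end":
            element.clear()  # same clean-up as in the question
    return seen


result = walk_first_device(BytesIO(SAMPLE))
for name, text in result:
    print("start", name, repr(text))
```

If your real file can contain several <device> elements, you would loop over devices instead of taking devices[0]; the point is only that iterwalk needs an element (or tree), not the list that .xpath() hands back.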
