5

So I will ask the question somewhat vaugley because I'm first not sure if the question can be asked.. Here goes,

I want to read an XML tree in using Python3 which I am new to. I have accomplished this with relative ease using:

xml.etree.ElementTree.parse(urllib.request.urlopen(url))

The XML stream are different data sets and there is a XSD which is available which I have also parsed int he same way. Now, my question is can I create a parser using the XSD schema? I am new to XML in this way, but I have found examples where the parser object was generated using the XSD then the XML was read in accordingly. However, I cannot find the equivalent in Python3.

Here is vaugely what I want in Python2.X:

schema = etree.XMLSchema(schema_root)
xmlparser = etree.XMLParser(schema=schema)

I'm not sure if I'm even conceptualizing this correctly. Maybe this is an XML problem not a python problem, i.e., maybe you can only validate the XML against the schema and not actually use it to parse with the specifics from the XSD. Anyone help clear this up?

1 Answer 1

2

xmlschema is a Python module to manage XML Schemas or validate XML instances to the XSD.

Example to validate XML documents from URL against the XML Schema using Python3.

import requests
import xmlschema
import xml.etree.ElementTree as ET

# to make it simple use external XML Schema and create a local file from it to validate XML examples

xsd_url = "https://raw.githubusercontent.com/sissaschool/xmlschema/master/tests/test_cases/examples/collection/collection.xsd"

with open("test.xsd", "w", newline="") as out:
    out.write(requests.get(xsd_url).text)
xsd = xmlschema.XMLSchema("test.xsd")

# XML #1 validates to the Schema
url1 = "https://raw.githubusercontent.com/sissaschool/xmlschema/master/tests/test_cases/examples/collection/collection.xml"
xt = ET.fromstring(requests.get(url1).text)
print("xml1 valid=", xsd.is_valid(xt), sep="")

# XML #2 with invalid structure
url2 = "https://raw.githubusercontent.com/sissaschool/xmlschema/master/tests/test_cases/examples/collection/collection-1_error.xml"
xt = ET.fromstring(requests.get(url2).text)
print("xml2 valid=", xsd.is_valid(xt), sep="")

Output:

xml1 valid=True
xml2 valid=False
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.