I am trying to read an XML file from a command line argument. I am new to using libxml2 and XPath in general. I want to query using XPath.

XML:

<?xml version="1.0"?>                                                                                                                                     
<xmi:XMI xmlns:cas="http:///text/cas.ecore" xmlns:audioform="http:something" xmlns:xmi="http://blahblah" xmlns:lib="http://blahblah" xmlns:solr="http:blahblah" xmlns:tcas="http:///blah" xmi:version="2.0">                                                
  <cas:NULL xmi:id="0"/>                                                                                                                                     
  <cas:Sofa xmi:id="9" Num="1" ID="First" Type="text" String="play a song"/>    
  <cas:Sofa xmi:id="63" Num="2" ID="Second" Type="text" String="Find a contact"/>     
  <cas:Sofa xmi:id="72" Num="3" ID="Third" Type="text" String="Send a message"/>     
  <lib:Confidence xmi:id="1" sofa="9" begin="0" end="1" key="context" value="" confidence="1.0"/>                                                                          
</xmi:XMI>

Code:

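import sys
import libxml2

def main(argv):
    xmlfile = argv[0]
    doc = libxml2.parseFile(xmlfile)
    root2 = doc.children  # first child of the document, here the root element

    print(root2)  # This prints everything but <?xml version="1.0"?>
    result = root2.xpathEval("//*")

    for node in result:
        print(node)
        print(node.nodePath(), node.name, node.content)

    doc.freeDoc()  # libxml2 documents must be freed explicitly

if __name__ == "__main__":
    main(sys.argv[1:])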

I want to go further and do some kind of processing using this file.

  1. How do I get values like 63 from xmi:id="63" using XPath?
  2. Find String where xmi:id = "72". The result should be "Send a message".
  3. Find String where xmi:id = "72" and ID = "Third". The result should be "Send a message".
  4. I tried using node.nodePath(), node.name and node.content for this node:

    <cas:Sofa xmi:id="9" Num="1" ID="First" Type="text" String="play a song"/>
    

    The results are: /xmi:XMI/cas:Sofa[1] from nodePath(), Sofa as the name, and no content is printed.

How do I go about getting 1, 2 and 3?

2 Answers


With respect to namespaces, lxml lets you pass a prefix-to-URI map so that the prefixed names in the XPath resolve:

>>> from lxml import etree
>>> doc = etree.parse('in.html')
>>> names = {'cas':'http:///text/cas.ecore', 'xmi': 'http://blahblah'}
>>> doc.xpath('//cas:Sofa[@xmi:id="63"]', namespaces=names)
[<Element {http:///text/cas.ecore}Sofa at 0x10550a5f0>]
>>> doc.xpath('//cas:Sofa[@xmi:id="63"]/@String', namespaces=names)
['Find a contact']
>>> doc.xpath('//cas:Sofa[@xmi:id="72" and @ID="Third"]/@String', namespaces=names)
['Send a message']
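
For anyone who wants the session above as a script, here is a minimal sketch. It assumes lxml is installed; the names NAMES, sofa_string and main are illustrative and not part of lxml. The xmi:id value is passed as an XPath variable ($xid) rather than pasted into the expression string.

```python
import sys
from lxml import etree

# Prefix-to-URI map; the URIs must match the xmlns declarations in the file.
NAMES = {'cas': 'http:///text/cas.ecore', 'xmi': 'http://blahblah'}


def sofa_string(doc, xmi_id):
    """Return the String attribute of the cas:Sofa with this xmi:id, or None."""
    hits = doc.xpath('//cas:Sofa[@xmi:id=$xid]/@String',
                     namespaces=NAMES, xid=xmi_id)
    return hits[0] if hits else None


def main(argv):
    doc = etree.parse(argv[0])                       # path from the command line
    print(doc.xpath('//@xmi:id', namespaces=NAMES))  # every xmi:id value
    print(sofa_string(doc, '72'))


if __name__ == '__main__' and len(sys.argv) > 1:
    main(sys.argv[1:])
```

Note that xpath() is provided by lxml.etree, not by the standard library's xml.etree.ElementTree, so the import at the top matters.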

1 Comment

Hi Guy, could you give the first few lines of the program? I get AttributeError: 'ElementTree' object has no attribute 'xpath'. How do I complete this:

    def main(argv):
        elem_list=[]
        elem_num=0
        try:
            xmlfile=argv[0]
            doc=ET.parse(xmlfile)
            root=doc.getroot()
            for child in root:
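
The AttributeError in this comment is what the standard library's xml.etree.ElementTree gives, because its trees have no xpath() method; the answer's session uses lxml.etree instead. If you would rather stay with the standard library, a rough sketch along these lines might do. XML_SAMPLE is a trimmed copy of the question's file, and NS, XMI_ID and find_string are illustrative names. For a real file, replace ET.fromstring(XML_SAMPLE) with ET.parse(xmlfile).getroot().

```python
import xml.etree.ElementTree as ET

NS = {'cas': 'http:///text/cas.ecore', 'xmi': 'http://blahblah'}
# ElementTree stores namespaced attribute names in {uri}local form.
XMI_ID = '{http://blahblah}id'

# Trimmed copy of the question's XML, embedded so the sketch is self-contained.
XML_SAMPLE = """<?xml version="1.0"?>
<xmi:XMI xmlns:cas="http:///text/cas.ecore" xmlns:xmi="http://blahblah" xmi:version="2.0">
  <cas:NULL xmi:id="0"/>
  <cas:Sofa xmi:id="9" Num="1" ID="First" Type="text" String="play a song"/>
  <cas:Sofa xmi:id="63" Num="2" ID="Second" Type="text" String="Find a contact"/>
  <cas:Sofa xmi:id="72" Num="3" ID="Third" Type="text" String="Send a message"/>
</xmi:XMI>
"""


def find_string(root, xmi_id, sofa_id=None):
    """Return String of the first cas:Sofa matching xmi:id (and ID, if given)."""
    for sofa in root.findall('cas:Sofa', NS):
        if sofa.get(XMI_ID) != xmi_id:
            continue
        if sofa_id is not None and sofa.get('ID') != sofa_id:
            continue
        return sofa.get('String')
    return None


root = ET.fromstring(XML_SAMPLE)

# 1. every xmi:id value in the document
ids = [el.get(XMI_ID) for el in root.iter() if el.get(XMI_ID) is not None]
print(ids)                                # ['0', '9', '63', '72']
# 2. String where xmi:id is "72"
print(find_string(root, '72'))            # Send a message
# 3. String where xmi:id is "72" and ID is "Third"
print(find_string(root, '72', 'Third'))   # Send a message
```

The filtering is done in plain Python here because ElementTree only supports a subset of XPath (no "and" inside a predicate, no /@attribute step).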

I'm not familiar with Python, but the following XPaths should do:

1.) //*/@xmi:id

2.) //*[@xmi:id='72']/@String

3.) //*[@xmi:id='72' and @ID='Third']/@String

Attributes are selected with @; conditions (predicates) go in square brackets ([]).

Be aware that your XML uses namespaces. Instead of just selecting everything (//*), consider using more specific XPaths (such as /xmi:XMI/cas:Sofa) and a namespace manager, since prefixes like xmi: and cas: in an XPath only resolve once they are bound to their namespace URIs.

1 Comment

Thanks, but could you give me the complete command, just so I know whether I am missing anything? I get an xmlXPathEval() failed error. :( Could I use the XPath on each node?
