I am trying to write a function that takes an XML object and an arbitrary number of tags, each defined by a tuple containing a tag name, attribute, and attribute value (e.g. ('tag1', 'id', '1')), and returns the most specific node possible. My code is below:

from xml.dom import minidom

def _search(object, *pargs):
    if len(pargs) == 0:
        print "length of pargs was zero"
        return object
    else:
        print "length of pargs is %s" % len(pargs)
    if pargs[0][1]:
        for element in object.getElementsByTagName(pargs[0][0]):
            if element.attributes[pargs[0][1]].value == pargs[0][2]:
                _search(element, *pargs[1:])
    else:
        if object.getElementsByTagName(pargs[0][0]) == 1:
            _search(element, *pargs[1:])
def main():
    xmldoc = minidom.parse('./example.xml')
    tag1 = ('catalog_item', 'gender', "Men's")
    tag2 = ('size', 'description', 'Large')
    tag3 = ('color_swatch', '', '')

    args = (tag1, tag2, tag3)
    node = _search(xmldoc, *args)
    node.toxml()
if __name__ == "__main__":
    main()

Unfortunately, this doesn't seem to work. Here's the output when I run the script:

$ ./secondsearch.py
length of pargs is 3
length of pargs is 2
length of pargs is 1
Traceback (most recent call last):
  File "./secondsearch.py", line 35, in <module>
    main()
  File "./secondsearch.py", line 32, in main
    node.toxml()
AttributeError: 'NoneType' object has no attribute 'toxml'

Why isn't the 'if len(pargs) == 0' clause being exercised? If I do manage to get the xml object returned to my main method, can I then pass the object into some other function (which could change the value of the node, or append a child node, etc.)?

Background: I'm using Python to automate testing processes. The environment is Cygwin on WinXP/Vista/7, and the Python version is 2.5.2. I would prefer to stay within the standard library if at all possible.

Here's the working code:

def _search(object, *pargs):
    if len(pargs) == 0:
        print "length of pargs was zero"
        # Base case: nothing left to match, so this is the most specific node.
        return object
    else:
        print "length of pargs is %s" % len(pargs)
    elements = object.getElementsByTagName(pargs[0][0])
    for element in elements:
        if pargs[0][1]:
            if element.attributes[pargs[0][1]].value == pargs[0][2]:
                return _search(element, *pargs[1:])
        else:
            # No attribute given: descend only if the tag name is unambiguous.
            if len(elements) == 1:
                return _search(element, *pargs[1:])
    return object
    Your code seems to have been flattened; you should probably fix it so people can run it. Commented Jul 21, 2009 at 7:45

2 Answers

I assume you're using http://www.eggheadcafe.com/community/aspnet/17/10084853/xml-viewer.aspx as your sample data...

As Vinay pointed out, you don't return anything from your recursive calls to _search.

In your else case, you never define element, but you still pass it to _search().

Also, you don't do anything if pargs[0][1] is empty but object.getElementsByTagName(pargs[0][0]) returns more than one Node (which is also why your len(pargs) == 0 case never gets hit).

And after all that, if that sample data is correct, there are two matching nodes, so you'll have a NodeList containing:

        <color_swatch image="red_cardigan.jpg">Red</color_swatch>
        <color_swatch image="burgundy_cardigan.jpg">Burgundy</color_swatch>

and you can't call .toxml() on a NodeList...
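
A minimal sketch of the NodeList point, assuming a small inline fragment built from the two elements quoted above. The SAMPLE string and the variable names are made up for illustration and are not taken from the question's example.xml:

```python
from xml.dom import minidom

# Hypothetical fragment reusing the two color_swatch elements quoted above.
SAMPLE = """<size description="Large">
  <color_swatch image="red_cardigan.jpg">Red</color_swatch>
  <color_swatch image="burgundy_cardigan.jpg">Burgundy</color_swatch>
</size>"""

doc = minidom.parseString(SAMPLE)

# getElementsByTagName returns a NodeList (a list of nodes), not a single Node.
swatches = doc.getElementsByTagName('color_swatch')

# Serialize each node individually; the list itself has no toxml() method.
serialized = [swatch.toxml() for swatch in swatches]

# Or narrow the list down to a single node before using it.
first = swatches[0]
```

So the caller either loops over the result or supplies enough tag/attribute information to end up with exactly one node.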

Thanks. You made several valid points, all of which I kept in mind as I rewrote the function. The only one not represented in the code itself is the section dealing with a NodeList being returned - that got a comment in my script reminding me to include enough information to obtain a specific node.

Shouldn't you be inserting a return in front of your recursive calls to _search? The way you have it now, some exit paths from _search don't have a return statement, so they will return None - which leads to the exception you're seeing.
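
To illustrate, here is a rough sketch of a recursive search in which every exit path returns something. It is not the asker's exact function: the name search, the SAMPLE data, and the choice to return None when no full match exists (rather than the most specific partial match) are all assumptions made for the example.

```python
from xml.dom import minidom

# Hypothetical sample data, loosely modelled on the catalog in the question.
SAMPLE = """<catalog>
  <catalog_item gender="Men's">
    <size description="Large">
      <color_swatch image="red_cardigan.jpg">Red</color_swatch>
    </size>
  </catalog_item>
</catalog>"""


def search(node, *specs):
    """Return the first node matching every (tag, attribute, value) spec, else None."""
    if not specs:
        return node  # base case: nothing left to match
    tag, attr, value = specs[0]
    for element in node.getElementsByTagName(tag):
        # An empty attribute name means "match on the tag name alone".
        if not attr or element.getAttribute(attr) == value:
            found = search(element, *specs[1:])
            if found is not None:
                return found  # pass the result back up the call chain
    return None  # explicit: no match under this node


doc = minidom.parseString(SAMPLE)
node = search(doc,
              ('catalog_item', 'gender', "Men's"),
              ('size', 'description', 'Large'),
              ('color_swatch', '', ''))
missing = search(doc, ('catalog_item', 'gender', "Women's"))
```

Without the return in front of the recursive call, the innermost call would still find the node, but each outer call would fall off the end of the function and hand None back to the caller.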

I'm not sure what you mean by 'inserting a return in front of your recursive calls', specifically 'in front of'. Could you explain that more fully?
Yes - your statement should be return _search(element, *pargs[1:]) rather than just _search(element, *pargs[1:])
Thanks for the clarification. I'm giving Stobor the solution, but I did up your score to reflect your significant assistance. Again, thanks.
