1

I need some help on a homework question on use of recursive functions.

We want to represent binary trees, where each leaf has a string as a label, by nested tuples. We require the leaves are labeled with distinct non-empty strings and all non-leaf nodes have exactly two children. E.g. ((('A', 'B'), 'C'), ('D', ('F', 'E'))) is a valid binary tree.

And I'm to write a recursive function valid_binary_tree(tree) that checks, i.e. returns True or False, if a given tree is a recursive tuple representing a binary tree as described above.

I've divided the issue into two functions. One, which checks if a tuple of strings is valid. I.e. it is a tuple of distinct strings, such as ('a','b','c'), but not ('a','a','c') or such.

def validate_string_tuple(t):
"""
Validates whether input is a tuple of distinct strings.
1st step; check if input is a tuple
    if not, return False
2nd step; go through tuple; finds all 'leaves' of the tuple
    If any element is not a string, return False
3rd step; check for uniqueness of 'leaves'
    If length(list_of_leaves) != len(set_of_leaves) return False
"""
if isinstance(t,tuple):
    
    leaves_l=[]
    for i in range(len(t)):
        
        if isinstance(t[i],str) and t[i]!="":
            #Element in the tuple is a string. Append to list of leaves.
            leaves_l.append(t[i])
        else:
            #An element in the tuple is not a string.
            return False
    return (len(leaves_l)==len(set(leaves_l)))

else: return False

This seems to work well enough. I've tested it with ('ab',('c','d')) (False), ('a','a','b','c') (False), ('a','b','c') (True).

The latter part, for checking an entire tree, I've made rather a mess, I think. It works for some inputs, for instance it returns True if when used on ((('a','b'),'c'),('e',('f','g'))). However, it also returns true on, say, (('a','b'),'c','d') which it shouldn't (as far as I can tell from the description?). Regardless, it's very inelegant, and I'm sure it can be much improved.

def valid_binary_tree(tree):
"""
Checks if a tree (tuple) is a valid binary tree. All non-leaf nodes have
exactly two children, and all leaves are labeled by distinct strings.

- If the initial input is not a tuple, immediately return false.
"""
if not isinstance(tree,tuple):
    return False

#Initially true "list", since, for whatever reason, simply returning False below did not work.
bools=[True]
#Empty list to be populated with leaves (for checking uniqueness later)
leaves_l=[]
def traverse(tree):
    nonlocal leaves_l
    nonlocal bools
    
    if isinstance(tree, str):
        leaves_l+=(tree) #if we've gotten to a simple string (non tuple), it must be a leaf (str)
    
    elif validate_string_tuple(tree): #we've found a legit tuple.
        if len(tree)==2:
            leaves_l+=tree            #The tuple is of length 2, i.e. binary. Append to leaves.
            print('found a valid string tuple;',tree)
        else:
            print('found a string tuple of length >2') #non-binary tuple.
#return False had no effect here, even when the print was called. Hence the bools.append...
            bools.append(False)
    
    else:
        for child in tree: #recursively goes through tree, as it could have arbitrarily many nested tuples.
            traverse(child)

traverse(tree)

#check leaves list vs. set
if len(leaves_l)!=len(set(leaves_l)):
    print('Non-unique labels! Not a valid binary tree')
    return False
#Checks if any non-legit tuples have shown up
if all(bools): 
    return True
else: 
    return False

I also recognize that this cannot possibly be the most elegant way to do it, even if it does work for some instances.

1 Answer 1

2

It seems like you're overcomplicating it -- there's a pretty simple set of checks that need to be applied to figure out if a given tree (or any given node in that tree) is valid or not.

  • Is it a leaf we've seen before? False
  • Is it a leaf we haven't seen before? True
  • Is it not a tuple, or a tuple of the wrong length? False
  • Do both of the child nodes validate? (This is the recursive part!)

The use of nonlocal and multiple levels of indirection make your code hard to debug, but there are a couple of things I can easily imagine going wrong:

  • Your nonlocal state is "leaking" from one call to the next. (This sort of thing is exactly why most coders avoid global/nonlocal; this type of bug is absolutely fiendish and not fun to debug.)
  • Some important piece of data isn't making it from some helper function to the caller that needs it. (Again, this is complicated by the use of nonlocal etc -- when function contracts are muddy, it's hard to figure out who's failing to uphold them.)

You can avoid a lot of headache (and vastly simplify your code) by just creating a set on the initial call and explicitly passing it through all the recursive calls:

>>> def validate_tree(tree, leaves=None):
...     leaves = leaves or {None}
...     if isinstance(tree, str):
...         if tree in leaves:
...             return False  # duplicate leaf
...         leaves.add(tree)
...         return True  # unique leaf
...     if not isinstance(tree, tuple) or len(tree) != 2:
...         return False
...     return all(validate_tree(n, leaves) for n in tree)
...
>>> print(validate_tree(('a','a','b','c')))
False
>>> print(validate_tree(((('a','b'),'c'),('e',('f','g')))))
True
>>> print(validate_tree((('a','b'),'c','d')))
False
Sign up to request clarification or add additional context in comments.

3 Comments

Thanks a lot; very elegant solution. I'm still relatively new to programming, so I'm at the point of "acknowledge ignorance" - where I know my code is shoddy, but I don't see how to make it elegant just yet.
Don't use global/nonlocal, ever. (If you think you need to use nonlocal for something, think about what each function is taking as an argument and what it's returning, and whether that reflects what it actually does.) Have each function do one simple thing, and be fully in charge of that thing. (If you have two functions doing substantively the same thing, they should be one function, or they should be two functions that divide the labor more effectively.)
Thank you. Wrt. to nonlocal it was taken from some code in class. I probably messed up by taking it out of context. And thanks for the help/advice!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.