I have a string in python of the following form:
line a
line b
line ba
line bb
line bba
line bc
line c
line ca
line caa
line d
You can get the idea. It actually takes a very similar form to python code itself, in that there is a line, and below that line, indentations indicate part of a block, headed by the most recent line of a lesser indentation.
What I need to do is parse this code into a tree sructure, such that each root level line is the key of a dictionary, and its value is a dictionary representing all sublines. so the above would be:
{
'line a' => {},
'line b' => {
'line ba' => {},
'line bb' => {
'line bba' => {}
},
'line bc' => {}
},
'line c' => {
'line ca' => {
'line caa' => {}
},
},
'line d' => {}
}
here's what I've got:
def parse_message_to_tree(message):
buf = StringIO(message)
return parse_message_to_tree_helper(buf, 0)
def parse_message_to_tree_helper(buf, prev):
ret = {}
for line in buf:
line = line.rstrip()
index = len(line) - len(line.lstrip())
print (line + " => " + str(index))
if index > prev:
ret[line.strip()] = parse_message_to_tree_helper(buf, index)
else:
ret[line.strip()] = {}
return ret
The print shows lines that are left stripped and indexes of 0. I didn't think lstrip() was a mutator, but either way the index should still be accurate.
Any suggestions are helpful.
EDIT: Not sure what was going wrong before, but I tried again and it is closer to working, but still not quite right. Here's what I have now:
{'line a': {},
'line b': {},
'line ba': {'line bb': {},
'line bba': {'line bc': {},
'line c': {},
'line ca': {},
'line caa': {},
'line d': {}}}}
tree = lambda: defaultdict(tree); t = tree(); t['a']['b']['c'] = "bla"