I have some text that needs to be parsed. Here is a condensed sample of it:

apple {
    type=fruit
    varieties {
        color=red
        origin=usa
    }
}

The output should look like this:

apple.type=fruit
apple.varieties.color=red
apple.varieties.origin=usa

So far the only thing I have come up with is a sort of breadth-first approach in Python, but I can't figure out how to collect all the nested children.

progInput = """apple {
    type=fruit
    varieties {
        color=red
        origin=usa
    }
}
"""
progInputSplitToLines = progInput.split('\n')
childrenList = []
root = ""

def hasChildren():
    if "{" in progInputSplitToLines[0]:
        global root
        root = progInputSplitToLines[0].split(" ")[0]
    for e in progInputSplitToLines[1:]:
        if "=" in e:
            childrenList.append({e.split("=")[0].replace("    ", ""),e.split("=")[1].replace("    ", "")})
hasChildren()

PS: I looked into tree structures in Python and came across anytree (https://anytree.readthedocs.io/en/latest/). Do you think it would help in my case?

Could you please help me out? I'm not very good at parsing text. Thanks a bunch in advance. :)

1 Answer

Your text looks like HOCON, so you can try parsing it with the pyhocon module instead of writing a parser by hand.

Install: either run pip install pyhocon, or download the GitHub repository and install it manually with python setup.py install.

Basic usage:

from pyhocon import ConfigFactory

conf = ConfigFactory.parse_file('text.conf')

print(conf)

Which gives the following nested structure:

ConfigTree([('apple', ConfigTree([('type', 'fruit'), ('varieties', ConfigTree([('color', 'red'), ('origin', 'usa')]))]))])

ConfigTree is a subclass of collections.OrderedDict, as seen in the source code, so it behaves like an ordinary nested dictionary.
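To make the dictionary behaviour concrete, here is a small sketch that does not need pyhocon. It uses a plain OrderedDict as a stand-in with the same shape as the structure printed above; the variable names are only illustrative, and a real ConfigTree is assumed to respond to the same mapping operations because it subclasses OrderedDict.

```python
from collections import OrderedDict

# Stand-in with the same shape as the parsed structure (not a real ConfigTree).
tree = OrderedDict([
    ("apple", OrderedDict([
        ("type", "fruit"),
        ("varieties", OrderedDict([
            ("color", "red"),
            ("origin", "usa"),
        ])),
    ])),
])

# Ordinary nested indexing reaches a leaf value.
color = tree["apple"]["varieties"]["color"]

# Keys come back in insertion order, which is why the output order is stable.
apple_keys = list(tree["apple"].keys())

print(color)
print(apple_keys)
```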

UPDATE:

To get your desired output, you can make your own recursive function to collect all paths:

from pyhocon import ConfigFactory
from pyhocon.config_tree import ConfigTree

def config_paths(config):
    for k, v in config.items():
        if isinstance(v, ConfigTree):
            for k1, v1 in config_paths(v):
                yield (k,) + k1, v1
        else:
            yield (k,), v

config = ConfigFactory.parse_file('text.conf')
for k, v in config_paths(config):
    print('%s=%s' % ('.'.join(k), v))

Which outputs:

apple.type=fruit
apple.varieties.color=red
apple.varieties.origin=usa
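If you would rather not add a dependency, the same output can be produced for simple input like the sample with a stack of open block names. This is only a minimal sketch, not a HOCON parser: it assumes every line is exactly one of "name {", "}", or "key=value", and the function name flatten_blocks is made up for illustration.

```python
def flatten_blocks(text):
    """Yield 'a.b.c=value' strings from brace-nested key=value text."""
    path = []  # names of the blocks we are currently inside
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.endswith("{"):
            # "name {" opens a nested block: remember its name
            path.append(line[:-1].strip())
        elif line == "}":
            # "}" closes the innermost open block
            path.pop()
        elif "=" in line:
            # a leaf: prefix the key with the names of all open blocks
            key, value = line.split("=", 1)
            yield ".".join(path + [key.strip()]) + "=" + value.strip()


sample = """apple {
    type=fruit
    varieties {
        color=red
        origin=usa
    }
}
"""

for entry in flatten_blocks(sample):
    print(entry)
```

The stack is what answers the "how do I get all the children" part: each child simply inherits whatever block names are open at the time it is read, so no explicit tree has to be built.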