
I'm trying to write a Thrift parser with pyparsing.

The parse result I want is a dict that maps element names to parsed tokens. After defining the elements, I call scanString on each one to find its tokens, and then build a dict from the results.

But this requires multiple passes through the source, one per element: e.g. one for constants, one for exceptions, one for structs, and so on.

Is it possible to parse multiple elements in one go and still be able to separate the tokens according to their types?

1 Answer


Define a single parser containing all of the elements you are looking for:

parser = OneOrMore(parserA | parserB | parserC)

If you have overlapping names, then group the subparsers, and keep them by name:

parser = OneOrMore(Group(parserA)("A*") | Group(parserB)("B*") | Group(parserC)("C*"))

The results names with the trailing asterisks will keep all of the parsed matches, not just the last one (take the '*' off and see the difference in the parsed results).
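To make the difference concrete, here is a small sketch with two made-up token types (`word` and `number` are illustrative names, not part of the question's grammar). It assumes pyparsing is installed and uses the same camelCase method names as the rest of this answer.

```python
from pyparsing import OneOrMore, Word, alphas, nums

# Illustrative sub-parsers; any pyparsing expressions would do.
word = Word(alphas)
number = Word(nums)

# Plain results names: each new match replaces the previous one.
last_only = OneOrMore(word("w") | number("n"))

# Trailing '*': every match is kept under the name.
keep_all = OneOrMore(word("w*") | number("n*"))

text = "a 1 b 2"
r1 = last_only.parseString(text)
r2 = keep_all.parseString(text)

print(r1["w"])        # only the last word matched
print(list(r2["w"]))  # all words, in order of appearance
print(list(r2["n"]))  # all numbers, in order of appearance
```

Without the asterisk, `r1["w"]` holds a single token; with it, `r2["w"]` holds every match for that name.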

Now you can do:

results = parser.parseString(input)  # or use scanString or searchString
for aresult in results['A']:
    ...
for bresult in results['B']:
    ...
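
Applied to the original question, a single pass might look roughly like the sketch below. The three element definitions are heavily simplified, hypothetical stand-ins for real Thrift syntax (empty struct and exception bodies, integer-only constants), and the results names `consts`, `structs`, and `exceptions` are chosen here purely for illustration.

```python
from pyparsing import (
    Group, Keyword, OneOrMore, Suppress, Word, alphas, alphanums, nums,
)

identifier = Word(alphas + "_", alphanums + "_")

# Hypothetical, simplified element parsers (not a complete Thrift grammar).
const_def = (
    Keyword("const").suppress()
    + identifier("type")
    + identifier("name")
    + Suppress("=")
    + Word(nums)("value")
)
struct_def = (
    Keyword("struct").suppress() + identifier("name") + Suppress("{") + Suppress("}")
)
exception_def = (
    Keyword("exception").suppress() + identifier("name") + Suppress("{") + Suppress("}")
)

# One combined parser; each alternative is grouped and named with a trailing '*'.
parser = OneOrMore(
    Group(const_def)("consts*")
    | Group(struct_def)("structs*")
    | Group(exception_def)("exceptions*")
)

source = """
const i32 MAX = 10
struct User {}
exception NotFound {}
const i32 MIN = 1
"""

results = parser.parseString(source, parseAll=True)

# Build the dict of element kind -> parsed names from the single pass.
by_kind = {
    kind: [element["name"] for element in results[kind]]
    for kind in ("consts", "structs", "exceptions")
}
print(by_kind)
```

Each entry in `results["consts"]` is itself a grouped result, so fields such as `element["type"]` or `element["value"]` stay attached to the constant they came from. If a given kind might not appear in the input at all, `results.get(kind, [])` avoids a KeyError.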

3 Comments

Ooo... haven't had to use pyparsing for a bit... when did the * syntax come in (or is it just something I completely missed a year or more ago!?)
It's been around for a while (v1.5.6, Jun/2011). It's a shortcut: parserA('A*') is equivalent to parserA.setResultsName('A', listAllMatches=True).
Was aware of listAllMatches=True - wasn't aware of the (parser)('A*') availability... thanks, Paul.
