I'm currently using pyparsing to detect parenthesized (possibly nested) expressions in a string, so that I can spot reference numbers that have been mistakenly concatenated to words.

For instance, 'apple(4)'.

I want to identify the reference subtoken ('(4)'). However, when I use searchString, it returns a ParseResults object of [[7]], which doesn't include the parentheses. I want to find the substring in the original token, so I need the nesting characters included in the ParseResults object; i.e., I want to search for '(4)'. Is there a way to make searchString return the nesting characters?

  • Can you be more specific about what those parenthetical expressions might look like, and why you need to support nesting? nestedExpr is a quick-and-dirty helper for skipping over nested parens, braces, brackets, etc., while preserving the nesting structure. If you just want the raw substring, wrap nestedExpr in originalTextFor, which should include the enclosing ()'s. But if you really want to make sense of the contents, then I'd suggest you define actual recursive expressions for them. Commented Jun 14, 2017 at 21:05
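The originalTextFor suggestion in the comment above could look roughly like the sketch below. It assumes the default nestedExpr delimiters, '(' and ')', and reuses the 'apple(4)' token from the question; the variable names are only illustrative.

```python
from pyparsing import nestedExpr, originalTextFor

token = 'apple(4)'

# originalTextFor returns the matched source text as a single string,
# so the enclosing parentheses are kept instead of being stripped.
reference = originalTextFor(nestedExpr())

# Each match is a ParseResults holding one string: the raw substring.
matches = [m[0] for m in reference.searchString(token)]
print(matches)

# The raw substring can then be located in the original token.
position = token.index(matches[0])
print(position)
```

Because the match is returned as plain text, it can be passed straight to str.index or str.replace on the original token.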

1 Answer

Question: Is there a way to make searchString return the nesting characters?

Consider the following examples:

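from pyparsing import Word, nums, alphas

data = 'apple(4), banana(13), juice(1)'

# A word immediately followed by a parenthesized number; the literal
# '(' and ')' are kept as tokens in the results.
nested = Word(alphas) + '(' + Word(nums) + ')'
for item in data.split(','):
    print(item, "->", nested.searchString(item))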

Output:

apple(4) -> [['apple', '(', '4', ')']]
 banana(13) -> [['banana', '(', '13', ')']]
 juice(1) -> [['juice', '(', '1', ')']]

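import re

# Alternative with a regular expression: group 1 is the word, group 2 is
# the parenthesized reference number, parentheses included.
nObj = re.compile(r'(\w+?)(\(\d+\))')
findall = nObj.findall(data)
print('findall:{}'.format(findall))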

Output:

findall:[('apple', '(4)'), ('banana', '(13)'), ('juice', '(1)')]
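If the position of each reference in the original string is needed as well, re.finditer exposes it through the match objects. This is a small sketch with the same pattern and sample data; the spans list is just an illustrative name.

```python
import re

data = 'apple(4), banana(13), juice(1)'
pattern = re.compile(r'(\w+?)(\(\d+\))')

# For every match, record the parenthesized reference (group 2) together
# with its start and end offsets in the original string.
spans = [(m.group(2), m.start(2), m.end(2)) for m in pattern.finditer(data)]
print(spans)
```

Slicing data[start:end] with any of these offsets gives back the reference substring, parentheses included.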

Tested with Python: 3.4.2


2 Comments

This answer doesn't support nesting, but it is not clear from the OP's example that nesting is actually required. What you have here should work fine with the original sample text. Two other things you might try - add a parse action to auto-convert the Word(nums) to an int; and add results names to the "apple" and the "4" quantity to make access to them easier in the parsed results.
@Paul: I interpreted the OP's "nest characters" as the parentheses (...). Awaiting the OP's comment ...
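The two suggestions from the first comment (a parse action that converts the number to an int, and results names for the word and the reference) might be sketched like this. The results names 'word' and 'ref' are made up for illustration and do not come from the question or the answer.

```python
from pyparsing import Word, alphas, nums

data = 'apple(4), banana(13), juice(1)'

# Parse action: convert the matched digit string to an int at parse time.
number = Word(nums).setParseAction(lambda tokens: int(tokens[0]))

# Results names ('word' and 'ref' are hypothetical) allow access by
# attribute instead of by position.
entry = Word(alphas)('word') + '(' + number('ref') + ')'

pairs = [(m.word, m.ref) for m in entry.searchString(data)]
print(pairs)
```

With names attached, the word and its reference number can be read without counting past the '(' and ')' tokens.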
