2

I need a method that makes a list from a string like following:

"( * 1 2 ( - 4 3 ) )" -> ["*", 1, 2, ["-", 4, 3]] 

Is there any simple way to handle this problem?

2
  • 2
    Writing a Lisp interpreter? Commented Jan 25, 2014 at 23:52
  • Recursive descent parser. Commented Jan 26, 2014 at 0:01

3 Answers 3

3

Some thing like this:

a = " ( * 1 2 ( - 4 35 ) ( + 100 ( / 1 2 ) ) ( + 100 200 ) )"

def p(s):
    r = []
    for x in s:
        if x == "(":
            r.append( p(s) )
        elif x == ")":
            return r
        else:
            r.append(x)
    return r

p(iter(a.split()))

the output is:

Out[23]:

[['*',
  '1',
  '2',
  ['-', '4', '35'],
  ['+', '100', ['/', '1', '2']],
  ['+', '100', '200']]]

You need add some code to convert string to number.

Sign up to request clarification or add additional context in comments.

Comments

3

use pyparsing, like:

from pyparsing import *

enclosed = Forward()
nestedParens = nestedExpr('(', ')', content=enclosed) 
integer = Word( nums ) # simple unsigned integer
arithOp = Word( "+-*/", max=1 ) # arithmetic operators
enclosed << ( nestedParens | arithOp | integer )

data = '( * 1 2 ( - 4 3 ) )' 

print enclosed.parseString(data).asList()

output:

$ python parse.py 
[['*', '1', '2', ['-', '4', '3']]]

2 Comments

It is not necessary to use a Forward to show that nestedExpr might include like nestedExpr's - that's why it's named nestedExpr. :) All that is needed is just expr = nestedExpr('(',')', content=integer|arithOp) and then use expr.parseString(data) on the original string. Otherwise, nice answer, thanks for mentioning pyparsing!
Also, the OP had asked about doing on-the-fly conversion of ints or floats. Pyparsing allows you to define parse actions that will do this kind of conversion at parse time. integer = Word(nums).setParseAction(lambda tokens: int(tokens[0])) will do this, similar for float.
2

There's no simple built-in to call or one-liner trick you can use, but it's still entirely manageable.

First, you'll want to tokenize the input. Roughly speaking, that means separating it into units like (, *, and 123. If your input is guaranteed to be space-separated, you can just use the split method, but if you need to handle input like (* (+ 1 2) 3), it could be a bit harder.

>>> "( * 1 2 ( - 4 3 ) )".split()
['(', '*', '1', '2', '(', '-', '4', '3', ')', ')']

Now that you have a sequence of tokens, we can write a recursive parser. Go over the tokens one at a time.

  • If you see a number, call int or float on it and return the result.
  • If you see something that isn't a number or a parenthesis, just return it as a string.
  • If you see an opening parenthesis, recursively parse objects from the token sequence and add them to a list until you see a closing parenthesis. Return the list.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.