4

I'm writing a small scheme interpreter in Scala and I'm running into problems parsing lists in Scheme. My code parses lists that contain multiple numbers, identifiers, and booleans, but it chokes if I try to parse a list containing multiple strings or lists. What am I missing?

Here's my parser:

class SchemeParsers extends RegexParsers {

// Scheme boolean #t and #f translate to Scala's true and false
def bool : Parser[Boolean] = 
    ("#t" | "#f") ^^ {case "#t" => true; case "#f" => false}

// A Scheme identifier allows alphanumeric chars, some symbols, and 
// can't start with a digit
def id : Parser[String] = 
    """[a-zA-Z=*+/<>!\?][a-zA-Z0-9=*+/<>!\?]*""".r ^^ {case s => s}

// This interpreter only accepts numbers as integers
def num : Parser[Int] = """-?\d+""".r ^^ {case s => s toInt}

// A string can have any character except ", and is wrapped in "
def str : Parser[String] = '"' ~> """[^""]*""".r <~ '"' ^^ {case s => s}

// A Scheme list is a series of expressions wrapped in ()
def list : Parser[List[Any]] = 
    '(' ~> rep(expr) <~ ')' ^^ {s: List[Any] => s}

// A Scheme expression contains any of the other constructions
def expr : Parser[Any] = id | str | num | bool | list ^^ {case s => s}
} 
5
  • How do you handle whitespace? Commented Dec 6, 2010 at 6:40
  • 1
    Why do you need ^^ {case s => s} ? Commented Dec 6, 2010 at 15:19
  • @MJP +1 , ^^ {case s => s} can be removed Commented Dec 6, 2010 at 15:22
  • @Gabe One would presume that RegexParsers default handling of whitespace is in force. Commented Dec 6, 2010 at 18:44
  • Failing test cases would be useful. Commented Dec 6, 2010 at 18:52

2 Answers 2

3

As it was correctly pointed out by @Gabe, you left some white-spaces unhandled:

scala> object SchemeParsers extends RegexParsers {
     |
     | private def space  = regex("[ \\n]*".r)
     |
     | // Scheme boolean #t and #f translate to Scala's true and false
     | private def bool : Parser[Boolean] =
     |     ("#t" | "#f") ^^ {case "#t" => true; case "#f" => false}
     |
     | // A Scheme identifier allows alphanumeric chars, some symbols, and
     | // can't start with a digit
     | private def id : Parser[String] =
     |     """[a-zA-Z=*+/<>!\?][a-zA-Z0-9=*+/<>!\?]*""".r
     |
     | // This interpreter only accepts numbers as integers
     | private def num : Parser[Int] = """-?\d+""".r ^^ {case s => s toInt}
     |
     | // A string can have any character except ", and is wrapped in "
     | private def str : Parser[String] = '"' ~> """[^""]*""".r <~ '"' <~ space ^^ {case s => s}
     |
     | // A Scheme list is a series of expressions wrapped in ()
     | private def list : Parser[List[Any]] =
     |     '(' ~> space  ~> rep(expr) <~ ')' <~ space ^^ {s: List[Any] => s}
     |
     | // A Scheme expression contains any of the other constructions
     | private def expr : Parser[Any] = id | str | num | bool | list ^^ {case s => s}
     |
     | def parseExpr(str: String) = parse(expr, str)
     | }
defined module SchemeParsers   

scala> SchemeParsers.parseExpr("""(("a" "b") ("a" "b"))""")
res12: SchemeParsers.ParseResult[Any] = [1.22] parsed: List(List(a, b), List(a, b))

scala> SchemeParsers.parseExpr("""("a" "b" "c")""")
res13: SchemeParsers.ParseResult[Any] = [1.14] parsed: List(a, b, c)

scala> SchemeParsers.parseExpr("""((1) (1 2) (1 2 3))""")
res14: SchemeParsers.ParseResult[Any] = [1.20] parsed: List(List(1), List(1, 2), List(1, 2, 3))
Sign up to request clarification or add additional context in comments.

3 Comments

Excellent! Thank you! I thought the Regex parser handled whitespace for me, but I see I was mistaken.
RegexParsers skip space automatically at the beginning of any literal string or regular expression. I wonder if the problem is related to the Char being used...
@stomcavage RegexParsers does handle space for you, just not with Char.
1

The only problem with the code is your usage of characters instead of strings. Below, I removed the redundant ^^ { case s => s } and replaced all characters with strings. I'll further discuss this issue below.

class SchemeParsers extends RegexParsers {

// Scheme boolean #t and #f translate to Scala's true and false
def bool : Parser[Boolean] = 
    ("#t" | "#f") ^^ {case "#t" => true; case "#f" => false}

// A Scheme identifier allows alphanumeric chars, some symbols, and 
// can't start with a digit
def id : Parser[String] = 
    """[a-zA-Z=*+/<>!\?][a-zA-Z0-9=*+/<>!\?]*""".r ^^ {case s => s}

// This interpreter only accepts numbers as integers
def num : Parser[Int] = """-?\d+""".r ^^ {case s => s toInt}

// A string can have any character except ", and is wrapped in "
def str : Parser[String] = "\"" ~> """[^""]*""".r <~ "\""

// A Scheme list is a series of expressions wrapped in ()
def list : Parser[List[Any]] = 
    "(" ~> rep(expr) <~ ")" ^^ {s: List[Any] => s}

// A Scheme expression contains any of the other constructions
def expr : Parser[Any] = id | str | num | bool | list
}

All Parsers have an implicit accept for their Elem types. So, if the basic element is a Char, such as in RegexParsers, then there's an implicit accept action for them, which is what happens here for the symbols (, ) and ", which are characters in your code.

What RegexParsers do automatically is to skip white spaces (defined as protected val whiteSpace = """\s+""".r, so you could override that) automatically at the beginning of any String or Regex. It also takes care of moving the positioning cursor past the white space in case of error messages.

One consequence of this that you seem not to have realized is that " a string beginning with a space" will have its prefix space removed from the parsed output, which is very unlikely to be something you want. :-)

Also, since \s includes new lines, a new line will be acceptable before any identifier, which may or may not be what you want.

You may disable space skipping in your regex as a whole by overrideing skipWhiteSpace. On the other hand, the default skipWhiteSpace tests for whiteSpace's length, so you could potentially turn it on and off just by manipulating the value of whiteSpace throughout the parsing process.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.