I need to write code that parses a small language. I am stuck on parsing variable names: a name must be at least one character long, must start with a lowercase letter, and may contain the underscore character '_'. I think I made a good start with the following code:

identToken :: Parser String
identToken = do 
                       c <- letter
                       cs <- letdigs
                       return (c:cs)
             where letter = satisfy isLetter
                   letdigs = munch isLetter +++ munch isDigit +++ munch underscore
                   num = satisfy isDigit
                   underscore = \x -> x == '_'
                   lowerCase = \x -> x `elem` ['a'..'z'] -- how to add this function to current code?

ident :: Parser Ident
ident = do 
          _ <- skipSpaces
          s <- identToken
          skipSpaces; return $ s

idents :: Parser Command
idents = do 
          skipSpaces; ids <- many1 ident
          ...

However, this parser gives me weird results. If I call my test function

test_parseIdents :: String -> Either Error [Ident]
test_parseIdents p = 
  case readP_to_S prog p of
    [(j, "")] -> Right j
    [] -> Left InvalidParse
    multipleRes -> Left (AmbiguousIdents multipleRes)
  where
    prog :: Parser [Ident]
    prog = do
      result <- many ident
      eof
      return result

like this:

test_parseIdents  "test"

I get this:

Left (AmbiguousIdents [(["test"],""),(["t","est"],""),(["t","e","st"],""),
    (["t","e","st"],""),(["t","est"],""),(["t","e","st"],""),(["t","e","st"],""),
    (["t","e","s","t"],""),(["t","e","s","t"],""),(["t","e","s","t"],""),
    (["t","e","s","t"],""),(["t","e","s","t"],""),(["t","e","s","t"],""),
    (["t","e","s","t"],""),(["t","e","s","t"],""),(["t","e","s","t"],""),
    (["t","e","s","t"],""),(["t","e","s","t"],""),(["t","e","s","t"],""),
    (["t","e","s","t"],""),(["t","e","s","t"],""),(["t","e","s","t"],""),
    (["t","e","s","t"],""),(["t","e","s","t"],""),(["t","e","s","t"],""),
    (["t","e","s","t"],""),(["t","e","s","t"],""),(["t","e","s","t"],""),
    (["t","e","s","t"],""),(["t","e","s","t"],""),(["t","e","s","t"],"")])

Note that Parser a is just a synonym for ReadP a.

I also want to encode in the parser that variable names should start with a lowercase character.

Thank you for your help.

1 Answer

Part of the problem is with your use of the +++ operator. The following code works for me:

import Data.Char
import Text.ParserCombinators.ReadP

type Parser a = ReadP a
type Ident = String

identToken :: Parser String
identToken = do c <- satisfy lowerCase
                cs <- letdigs
                return (c:cs)
  where lowerCase = \x -> x `elem` ['a'..'z']
        underscore = \x -> x == '_'
        letdigs = munch (\c -> isLetter c || isDigit c || underscore c)

ident :: Parser Ident
ident = do _ <- skipSpaces
           s <- identToken
           skipSpaces
           return s

test_parseIdents :: String -> Either String [Ident]
test_parseIdents p = case readP_to_S prog p of
    [(j, "")]   -> Right j
    []          -> Left "Invalid parse"
    multipleRes -> Left ("Ambiguous idents: " ++ show multipleRes)
  where prog :: Parser [Ident]
        prog = do result <- many ident
                  eof
                  return result

main = print $ test_parseIdents "test_1349_zefz"

So what went wrong:

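  • +++ is symmetric choice: it runs all of its alternatives and keeps every result that succeeds. munch never fails (it succeeds with an empty string when nothing matches), so all three alternatives in letdigs succeed after every first letter, and many ident multiplies those choices into the ambiguous result list you see. <++ is left-biased: it tries its right-hand side only when the left-hand side produces no result. That would remove the ambiguity in the parse, but it still leaves the next problem.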

  • Each alternative in your letdigs accepts a run of only one kind of character: letters, or digits, or underscores. A name that mixes them, such as test_1, can never be consumed as a single token. The parser had to be modified to munch characters that are letters, digits, or underscores.
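
Both points can be checked with a small experiment. This is only a sketch and not part of the original code: the names symmetric and leftBiased are made up for illustration, and the two parsers isolate the letdigs part of the question's identToken.

```haskell
import Data.Char (isDigit, isLetter)
import Text.ParserCombinators.ReadP

-- Symmetric choice, as in the question's letdigs:
-- every alternative that succeeds contributes a result.
symmetric :: ReadP String
symmetric = munch isLetter +++ munch isDigit +++ munch (== '_')

-- Left-biased choice: the right-hand side is tried only
-- when the left-hand side produces no result at all.
leftBiased :: ReadP String
leftBiased = munch isLetter <++ munch isDigit <++ munch (== '_')

main :: IO ()
main = do
  -- Three results: "est" from the letter branch, plus one empty
  -- match each from the digit and underscore branches.
  print (readP_to_S symmetric "est")
  -- A single result, so the ambiguity is gone.
  print (readP_to_S leftBiased "est")
  -- munch isLetter succeeds with "", so the other branches never run
  -- and "_1" is left unconsumed.
  print (readP_to_S leftBiased "_1")
```

The last line is why switching to <++ alone is not enough: because munch always succeeds, the first alternative always wins, even when it matches nothing.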

I also removed some functions that were unused and made an educated guess for the definition of your datatypes.
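
For the lowercase requirement, Data.Char also provides isAsciiLower, which could stand in for a hand-written elem test. The following is a minimal sketch under that assumption; identifier and parseIdents are hypothetical names, and returning Maybe instead of your Error type is a simplification.

```haskell
import Data.Char (isAsciiLower, isDigit, isLetter)
import Text.ParserCombinators.ReadP

-- Hypothetical name: one lowercase ASCII letter, then a maximal
-- run of letters, digits, and underscores.
identifier :: ReadP String
identifier = do
  c  <- satisfy isAsciiLower
  cs <- munch (\x -> isLetter x || isDigit x || x == '_')
  return (c : cs)

-- Parse whitespace-separated identifiers that cover the whole input.
-- Anything other than exactly one complete parse is reported as Nothing.
parseIdents :: String -> Maybe [String]
parseIdents input =
  case readP_to_S (skipSpaces *> many (identifier <* skipSpaces) <* eof) input of
    [(ids, "")] -> Just ids
    _           -> Nothing

main :: IO ()
main = do
  print (parseIdents "test_1349_zefz")  -- one identifier
  print (parseIdents "foo bar_2")       -- two identifiers
  print (parseIdents "Foo")             -- rejected: starts with uppercase
```

Because munch is greedy and deterministic, each identifier is consumed in exactly one way, so many produces a single complete parse for valid input.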
