
I am building a parser in Haskell using parser combinators, and I have an issue with parsing keywords such as "while", "true", "if", etc.

The issue is that a keyword must be followed by a separator or whitespace. For example, in the statement if cond then stat1 else stat2 fi;x = 1 every keyword is followed by either a space or a semicolon. However, in different situations there can be different separators.

Currently I have implemented it as follows:

keyword :: String -> Parser String
keyword k = do
  kword <- leadingWS (string k)
  check (== ';') <|> check isSpace <|> check (== ',') <|> check (== ']')
  junk
  return kword

However, the problem with this keyword parser is that it accepts programs containing statements like if; cond then stat1 else stat2 fi

We tried passing a (Char -> Bool) predicate to keyword, which would then be passed on to check. But this doesn't work, because at the point where we parse the keyword we don't know which separators are allowed.

I was wondering if I could have some help with this issue?

  • Just because a keyword is followed by a semi-colon, that doesn't mean it is being used correctly. Keywords are always part of some larger syntactic structure. Namely, a keyword is just a specific token which your grammar is built upon. The only difference between a keyword and an identifier is that the keyword is recognized by the grammar, rather than treated as a generic token left to later stages of parsing and evaluation. Commented Sep 8, 2016 at 18:34
  • The semi-colon itself is another token; you should have another combinator that recognizes correct collections of other tokens. (In this sense, the parser is really doing both lexical analysis--converting a stream of bytes into a stream of tokens--and parsing--converting a stream of tokens into an abstract syntax tree.) Commented Sep 8, 2016 at 18:43

1 Answer

Don't try to handle the separators in keyword. You do need to ensure, however, that the keyword "if" is not confused with an identifier such as "iffy" (see the comment by sepp2k).

keyword :: String -> Parser String
-- try makes a failed match backtrack; <* keeps the matched string as the result
keyword k = leadingWS $ try (string k <* notFollowedBy alphanum)
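
To try the idea in isolation, here is a minimal sketch. It assumes Parsec, since the question does not say which combinator library is in use; in Parsec the character parser is spelled alphaNum. The leadingWS below is a hypothetical stand-in that just skips whitespace before running its argument, so it may differ from the helper in the question.

```haskell
import Data.Either (isLeft)
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical stand-in for the question's leadingWS:
-- skip any whitespace, then run the given parser.
leadingWS :: Parser a -> Parser a
leadingWS p = spaces *> p

-- Match k only when the next character cannot continue an identifier.
-- try rewinds the input if the lookahead fails after string k matched.
keyword :: String -> Parser String
keyword k = leadingWS (try (string k <* notFollowedBy alphaNum))

main :: IO ()
main = do
  print (parse (keyword "if") "" "if cond")        -- Right "if"
  print (parse (keyword "if") "" "  if;")          -- Right "if"
  print (isLeft (parse (keyword "if") "" "iffy"))  -- True
```

Note that keyword accepts "if;" here. That is intended: whether a semicolon may follow is decided by the surrounding grammar, not by the keyword parser.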

Handling separators for statements would go like this:

statements = statement `sepBy` semi
statement  = ifStatement <|> assignmentStatement <|> ...
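
A slightly fuller sketch of this layout, again assuming Parsec. The Stmt type, the reserved list, the lexeme helper and the exact shape of the if statement are all invented for illustration; conditions are reduced to a single identifier and assignments to name = value to keep it short.

```haskell
import Control.Monad (void)
import Data.Either (isLeft)
import Text.Parsec
import Text.Parsec.String (Parser)

-- Toy syntax tree, invented for this sketch.
data Stmt
  = If String [Stmt] [Stmt]
  | Assign String String
  deriving (Show, Eq)

-- Run p, then skip trailing whitespace.
lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

reserved :: [String]
reserved = ["if", "then", "else", "fi"]

-- A keyword must not run on into an identifier character.
keyword :: String -> Parser ()
keyword k = lexeme (try (string k *> notFollowedBy alphaNum))

-- An identifier is a word that is not reserved.
identifier :: Parser String
identifier = lexeme . try $ do
  name <- (:) <$> letter <*> many alphaNum
  if name `elem` reserved
    then unexpected ("keyword " ++ show name)
    else return name

semi :: Parser ()
semi = void (lexeme (char ';'))

-- The separator lives here, in the grammar, not in keyword.
statements :: Parser [Stmt]
statements = statement `sepBy1` semi

statement :: Parser Stmt
statement = ifStatement <|> assignment

ifStatement :: Parser Stmt
ifStatement =
  If <$> (keyword "if" *> identifier)
     <*> (keyword "then" *> statements)
     <*> (keyword "else" *> statements)
     <*  keyword "fi"

assignment :: Parser Stmt
assignment =
  Assign <$> identifier <* lexeme (char '=') <*> lexeme (many1 alphaNum)

program :: Parser [Stmt]
program = spaces *> statements <* eof

main :: IO ()
main = do
  -- accepted: an if statement followed by an assignment
  print (parse program "" "if cond then stat1 = 1 else stat2 = 2 fi;x = 1")
  -- accepted: iffy is an identifier, not the keyword if
  print (parse program "" "iffy = 1")
  -- rejected: a condition, not a semicolon, must follow if
  print (isLeft (parse program "" "if; cond then a = 1 else b = 2 fi"))
```

The input if; cond ... is rejected because ifStatement expects a condition after the keyword, so keyword itself never needs a list of allowed separators.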

3 Comments

If you do it like this, the identifier "iffy" could be instead parsed as the keyword "if" followed by the identifier "fy". I believe (and I admit I'm totally guessing) the reason why OP decided that "after a keyword there is a requirement that there is a separator or whitespace" is to disallow this.
Oh, I see. Perhaps I missed that point in the question. I'll give it some thought and update my answer if I have anything to add.
