Using monad for simple Haskell parser

Question

TL;DR

I'm trying to understand how this:

satisfy :: (Char -> Bool) -> Parser Char
satisfy pred = PsrOf p
  where
    p (c:cs) | pred c = Just (cs, c)
    p _ = Nothing

Is equivalent to this:

satisfy :: (Char -> Bool) -> Parser Char
satisfy pred = do
    c <- anyChar
    if pred c then return c else empty

Context

This is a snippet from some lecture notes on Haskell parsing, which I'm trying to understand:

import Control.Applicative
import Data.Char
import Data.Functor
import Data.List

newtype Parser a = PsrOf (String -> Maybe (String, a))
    -- Function from input string to:
    --
    -- * Nothing, if failure (syntax error);
    -- * Just (unconsumed input, answer), if success.

dePsr :: Parser a -> String -> Maybe (String, a)
dePsr (PsrOf p) = p

-- Monadic Parsing in Haskell uses [] instead of Maybe to support ambiguous
-- grammars and multiple answers.

-- | Use a parser on an input string.
runParser :: Parser a -> String -> Maybe a
runParser (PsrOf p) inp = case p inp of
                            Nothing -> Nothing
                            Just (_, a) -> Just a
                          -- OR: fmap (\(_,a) -> a) (p inp)

-- | Read a character and return. Failure if input is empty.
anyChar :: Parser Char
anyChar = PsrOf p
  where
    p "" = Nothing
    p (c:cs) = Just (cs, c)

-- | Read a character and check against the given character.
char :: Char -> Parser Char
-- char wanted = PsrOf p
--   where
--     p (c:cs) | c == wanted = Just (cs, c)
--     p _ = Nothing
char wanted = satisfy (\c -> c == wanted)   -- (== wanted)

-- | Read a character and check against the given predicate.
satisfy :: (Char -> Bool) -> Parser Char
satisfy pred = PsrOf p
  where
    p (c:cs) | pred c = Just (cs, c)
    p _ = Nothing
-- Could also be:
-- satisfy pred = do
--     c <- anyChar
--     if pred c then return c else empty

instance Monad Parser where
    -- return :: a -> Parser a
    return = pure

    -- (>>=) :: Parser a -> (a -> Parser b) -> Parser b
    PsrOf p1 >>= k = PsrOf q
      where
        q inp = case p1 inp of
                  Nothing -> Nothing
                  Just (rest, a) -> dePsr (k a) rest

I understand everything up until the last bit of the Monad definition, specifically I don't understand how the following line returns something of type Parser b as is required by the (>>=) definition:

Just (rest, a) -> dePsr (k a) rest

It's difficult for me grasp what the Monad definition means without an example. Thankfully, we have one in the alternate version of the satisfy function, which uses do-notation (which of course means the Monad is being called). I really don't understand do-notation yet, so here's the desugared version of satisfy:

satisfy pred = do
    anyChar >>= (c ->
    if pred c then return c else empty)

So based on the first line of our (>>=)definition, which is

PsrOf p1 >>= k = PsrOf q

We have anyChar as our PsrOf p1 and (c -> if pred c then return c else empty) as our k. What I don't get is how in dePsr (k a) rest that (k a) returns a Parser (at least it shold, otherwise calling dePsr on it wouldn't make sense). This is made more confusing by the presence of rest. Even if (k a) returned a Parser, calling dePsr would extract the underlying function from the returned Parser and pass rest to it as an input. This is definitely doesn't return something of type Parser b as required by the definition of (>>=). Clearly I'm misunderstanding something somewhere.

Note that dePsr (k a) rest is inside the q function. The q function is not directly given as a result of ... >>= .... The result of ... >>= ... is PsrOf q, not q. — David Young
– David Young, Commented Apr 5, 2019 at 1:54
Also, note that k has type a -> Parser b (this can be seen in the commented-out type signature). So, if you give k an a value, it will give you a Parser b value. This is why k a :: Parser b. — David Young
– David Young, Commented Apr 5, 2019 at 1:55

John F. Miller · Accepted Answer · 2019-04-08 18:08:48Z

5

Ok, Maybe this will help. Let's start by puting some points back into dePsr.

dePsr :: Parser a -> String -> Maybe (String, a)
dePsr (PsrOf p) rest = p rest

And let's also write out return: (NB I'm putting in all the points for clarity)

return :: a -> Parser a
return a = PsrOf (\rest -> Just (rest, a))

And now from the Just branch of the (>>=) definition

 Just (rest, a) -> dePsr (k a) rest

Let's make sure we agree on what every thing is:

rest the string remaining unparsed after p1 is applied
a the result of applying p1
k :: a -> Parser b takes the result of the previous parser and makes a new parser
dePsr unwraps a Parser a back into a function `String -> Maybe (String, a)

Remember we will wrap this back into a parser again at the top of the function: PsrOf q

So in English bind (>>=) take a parser in a and a function from a to a parser in b and returns a parser in b. The resulting parser is made by wrapping q :: String -> Maybe (String, b) in the Parser constructor PsrOf. Then q, the combined parser, take a String called inp and applies the function p1 :: String -> Maybe (String,a) that we got from pattern matching against the first parser, and pattern matches on the result. For an error we propagate Nothing (easy). If the first parser had a result we have tow pieces of information, the still unparsed string called rest and the result a. We give a to k, the second parser combinator, and get a Parser b which we need to unwrap with dePsr to get a function (String -> Maybe (String,b) back. That function can be applied to rest for the final result of the combined parsers.

I think the hardest part about reading this is that sometimes we curry the parser function which obscures what is actually happening.

Ok for the satisfy example

satisfy pred 
  = anyChar >>= (c -> if pred c then return c else empty)

empty comes from the alternative instance and is PsrOf (const Nothing) so a parser that always fails.

Lets look at only the successful branches. By substitution of only the successful part:

PsrOf (\(c:cs) ->Just (cs, c)) >>= (\c -> PsrOf (\rest -> Just (rest, c)))

So in the bind (>>=) definition

p1 = \(c:cs -> Just (cs, c))
k = (\c -> PsrOf (\rest -> Just (rest, c)))
q inp = let Just (rest,a) = p1 inp in dePsr (k a) rest again only successful branch

Then q becomes

q inp = 
  let Just (rest, a) = (\(c:cs) -> Just (cs, c)) inp
   in dePsr (\c -> PsrOf (\rest -> Just (rest, c))) a rest

Doing a little β-reduction

q inp =
 let (c:cs) = inp
     rest = cs
     a = c
  in dePsr (PsdOf (\rest -> Just (rest, a))) rest -- dePsr . PsrOf = id

Finally cleaning up some more

q (c:cs) = Just (cs, c)

So if pred is successful we reduce satisfy back to exactly anyChar which is exactly what we expect, and exactly what we find in the first example of the question. I will leave it as and exersize to the reader (read: I'm lazy) to prove that if either inp = "" or pred c = False that the outcome is Nothing as in the first satisfy example.

NOTE: If you are doing anything other than a class assignment, you will save yourself hours of pain and frustration by starting with error handling from the beginning make your parser String -> Either String (String,a) it is easy to make the error type more general later, but a PITA to change everything from Maybe to Either.

Question: "[C]ould you explain how you arrived at return a = PsrOf (\rest -> Just (rest, a)) from return = pure after you put "points" back into return?

Answer: First off, it is pretty unfortunate to give the Monad instance definition without the Functor and Applicative definitions. The pure and return functions must be identical (It is part of the Monad Laws), and they would be called the same thing except Monad far predates Applicative in Haskell history. In point of fact, I don't "know" what pure looks like, but I know what it has to be because it is the only possible definition. (If you want to understand the the proof of that statement ask, I have read the papers, and I know the results, but I'm not into typed lambda calculus quite enough to be confident in reproducing the results.)

return must wrap a value in the context without altering the context.

return :: Monad m => a -> m a
return :: a -> Parser a -- for our Monad
return :: a -> PsrOf(\str -> Maybe (rest, value)) -- substituting the constructor (PSUDO CODE)

A Parser is a function that takes a string to be parsed and returns Just the value along with any unparsed portion of the original string or Nothing on failure, all wrapped in the constructorPsrOf. The context is the string to be parsed, so we cannot change that. The value is of course what was passed toreturn`. The parser always succeeds so we must return Just a value.

return a =  PsrOf (\rest -> Just (rest, a))

rest is the context and it is passed through unaltered. a is the value we put into the Monad context.

For completeness here is also the only reasonable definition of fmap from Functor.

fmap :: Functor f => (a->b) -> f a -> f b
fmap :: (a -> b) -> Parser a -> Parser b -- for Parser Monad
fmap f (PsrOf p) = PsrOf q
  where q inp = case p inp of
    Nothing -> Nothing
    Just (rest, a) -> Just (rest, f a)
  -- better but less instructive definition of q
  -- q = fmap (\(rest,a) -> (rest, f a)) . p

edited Apr 8, 2019 at 18:08

answered Apr 5, 2019 at 1:59

John F. Miller

27.3k11 gold badges74 silver badges127 bronze badges

Sign up to request clarification or add additional context in comments.

5 Comments

Sakeeb Hossain Over a year ago

I followed everything you wrote, but the main thing I was confused about is still the application of the monad in satisfy. You know how we have k :: a -> Parser b? In satisfy, if you look at the do-notation version commented out below, the k function matches with (c -> if pred c then return c else empty). As you established, k should take in some a and return a Parser b but I don't see how it does that in this instance. There are only two outcomes: return c or empty (refers to the Alternative I believe) neither of which are of type Parser b.

K. A. Buhr Over a year ago

The type of return c is Monad m => m Char, and the type of empty is Alternative m => m Char. Because your Parser has both Monad (shown) and Applicative (not shown) instances, both these types unify with Parser Char, so the expression if pred c then return c else empty DOES have type Parser Char.

Robin Zigmond Over a year ago

You haven't given the Alternative instance for Parser here: but I think it's helpful in this context to know that empty would be defined as empty = PsrOf $ const Nothing (that is, it's a "useless" parser that fails for every input - but such "useless" things have a mysterious way of ending up being very useful for defining abstract concepts). In particular, the else empty in the do block version corresponds exactly to the p _ = Nothing in the other version.

Sakeeb Hossain Over a year ago

@John one last thing: could you explain how you arrived at return a = PsrOf (\rest -> Just (rest, a)) from return = pure after you put "points" back into return? How can we assume to use (\rest -> Just (rest, a))? Also seriously, thanks for your responses. You've cleared up a lot.

dfeuer Over a year ago

@SakeebHossain, the definition of pure or return can be calculated from the definition of >>= (or even <*>, I believe). The monad identity laws pure x >>= f = f x and m >>= pure = m tell you all you need to know. Indeed, you only need one for the calculation. I think this time you're better off using the second law.

Collectives™ on Stack Overflow

Using monad for simple Haskell parser

TL;DR

Context

1 Answer 1

5 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

TL;DR

Context

1 Answer 1

5 Comments

Your Answer

Sign up or log in

Post as a guest

Related