I often need to map keywords straight to the constructors of a datatype, and I solve it as shown below. It quickly gets out of hand because every string value has to be repeated.

Is there a more compact way to express this?

import Text.ParserCombinators.Parsec

data Keyword = Apple | Banana | Cantaloupe

parseKeyword :: Parser Keyword
parseKeyword = (  string "apple"
              <|> string "banana"
              <|> string "cantaloupe"
               ) >>= return . strToKeyword
                    where strToKeyword str = case str of
                           "apple"      -> Apple
                           "banana"     -> Banana
                           "cantaloupe" -> Cantaloupe

EDIT:

As a follow-up question, since this seemed to be too easy: how would the compact solution work with try?

E.g.

import Text.ParserCombinators.Parsec

data Keyword = Apple | Apricot | Banana | Cantaloupe

parseKeyword :: Parser Keyword
parseKeyword = (  try (string "apple")
              <|> string "apricot"
              <|> string "banana"
              <|> string "cantaloupe"
               ) >>= return . strToKeyword
                    where strToKeyword str = case str of
                           "apple"      -> Apple
                           "apricot"    -> Apricot
                           "banana"     -> Banana
                           "cantaloupe" -> Cantaloupe

3 Answers

If you just want to avoid some repetition, you could use the (<$) operator:

import Text.ParserCombinators.Parsec
import Control.Applicative ((<$))

data Keyword = Apple | Banana | Cantaloupe

parseKeyword :: Parser Keyword
parseKeyword
    =   Apple      <$ string "apple"
    <|> Banana     <$ string "banana"
    <|> Cantaloupe <$ string "cantaloupe"

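For the follow-up about try, the same operator can be combined with try so that keywords sharing a prefix (apple, apricot) backtrack correctly. The following is only a sketch: the helper name keyword is hypothetical and not part of Parsec, and wrapping every alternative in try is an assumption made for simplicity rather than something the original code requires.

```haskell
import Text.ParserCombinators.Parsec

data Keyword = Apple | Apricot | Banana | Cantaloupe
    deriving (Show, Eq)

-- Hypothetical helper (not a Parsec function): match the literal string,
-- backtracking on failure, and return the given constructor on success.
keyword :: Keyword -> String -> Parser Keyword
keyword k s = k <$ try (string s)

parseKeyword :: Parser Keyword
parseKeyword
    =   keyword Apple      "apple"
    <|> keyword Apricot    "apricot"
    <|> keyword Banana     "banana"
    <|> keyword Cantaloupe "cantaloupe"

main :: IO ()
main = mapM_ (parseTest parseKeyword) ["apple", "apricot", "cantaloupe"]
```

Because each alternative backtracks on failure, the order of apple and apricot no longer matters here.
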
It's also possible to make a fully generic solution for any type that only has unit constructors using GHC.Generics:

{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeOperators #-}

import Text.ParserCombinators.Parsec
import Control.Applicative ((<*))
import Data.Char (toLower)
import GHC.Generics

class GParse f where
    gParse :: Parser (f a)

instance (GParse f, Constructor c) => GParse (C1 c f) where
    gParse = fmap M1 gParse <* string (map toLower $ conName (undefined :: t c f a))

instance GParse f => GParse (D1 c f) where
    gParse = fmap M1 gParse

instance (GParse a, GParse b) => GParse (a :+: b) where
    gParse = try (fmap L1 gParse) <|> fmap R1 gParse

instance GParse U1 where
    gParse = return U1

genericParser :: (Generic g, GParse (Rep g)) => Parser g
genericParser = fmap to gParse

That's quite a lot of boilerplate, but now you can create a parser for any compatible type with just:

{-# LANGUAGE DeriveGeneric #-}

data Keyword = Apricot | Apple | Banana | Cantaloupe deriving (Show, Generic)

parseKeyword :: Parser Keyword
parseKeyword = genericParser

Testing in GHCi:

> parseTest parseKeyword "apple"
Apple
> parseTest parseKeyword "apricot"
Apricot
> parseTest parseKeyword "banana"
Banana

Handling multi-word constructors like RedApple is just a matter of writing a string translation function that maps "RedApple" to "red_apple" and using it in the C1 instance, i.e.

import Data.List (intercalate)
import Data.Char (toLower, isLower)

mapName :: String -> String
mapName = intercalate "_" . splitCapWords where
    splitCapWords "" = []
    splitCapWords (x:xs) =
        let (word, rest) = span isLower xs
        in (toLower x : word) : splitCapWords rest

instance (GParse f, Constructor c) => GParse (C1 c f) where
    gParse = fmap M1 gParse <* string (mapName $ conName (undefined :: t c f a))
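
To see what mapName produces on its own, it can be run outside the parser. This is a small standalone sketch that repeats the definition above; the sample constructor names are made up for illustration.

```haskell
import Data.Char (isLower, toLower)
import Data.List (intercalate)

-- Same definition as above: split at each upper-case letter,
-- lower-case the pieces, and join them with underscores.
mapName :: String -> String
mapName = intercalate "_" . splitCapWords
  where
    splitCapWords "" = []
    splitCapWords (x:xs) =
        let (word, rest) = span isLower xs
        in (toLower x : word) : splitCapWords rest

main :: IO ()
main = mapM_ (putStrLn . mapName) ["Apple", "RedApple", "HTTPServer"]
-- prints:
--   apple
--   red_apple
--   h_t_t_p_server
```

Note that runs of capitals are split letter by letter, so names with acronyms would need a different splitting rule.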

2 Comments

Interesting addition, though a little over my head at this stage. Would that still be a good solution once the keyword-to-constructor mappings get more complicated, e.g. "red_apple" -> RedApple?
Added an example of how to parse constructor names like RedApple.

I’m not sure this is a terribly elegant solution, but if you derive a few more typeclasses:

data Keyword = Apple | Banana | Cantaloupe deriving (Eq, Read, Show, Enum, Bounded)

You can suddenly get all of the values:

ghci> [minBound..maxBound] :: [Keyword]
[Apple,Banana,Cantaloupe]

For any particular value, we can parse its lower-cased name and then return the value:

import Data.Char (toLower)

parseEnumValue :: (Show a) => a -> Parser a
parseEnumValue val = string (map toLower $ show val) >> return val

Then we can combine these to parse any value of the type:

parseEnum :: (Show a, Enum a, Bounded a) => Parser a
parseEnum = choice $ map parseEnumValue [minBound..maxBound]

Try it out:

ghci> parseTest (parseEnum :: Parser Keyword) "cantaloupe"
Cantaloupe
ghci> parseTest (parseEnum :: Parser Keyword) "orange"
parse error at (line 1, column 1):
unexpected "o"
expecting "apple", "banana" or "cantaloupe"
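
One caveat with the follow-up's Apple/Apricot case: as written, string "apple" consumes the leading "a" of "apricot" before failing, so choice never reaches the next alternative. A possible adjustment, sketched here under the assumption that backtracking on every keyword is acceptable, wraps each string in try:

```haskell
import Data.Char (toLower)
import Text.ParserCombinators.Parsec

data Keyword = Apple | Apricot | Banana | Cantaloupe
    deriving (Eq, Show, Enum, Bounded)

-- Variant of parseEnumValue that backtracks when the keyword does not match.
parseEnumValue :: Show a => a -> Parser a
parseEnumValue val = val <$ try (string (map toLower (show val)))

parseEnum :: (Show a, Enum a, Bounded a) => Parser a
parseEnum = choice (map parseEnumValue [minBound .. maxBound])

main :: IO ()
main = parseTest (parseEnum :: Parser Keyword) "apricot"
```

If one keyword is a complete prefix of another, the earlier constructor still wins, so declaration order matters in that case.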

2 Comments

Definitely the most creative solution. Elegant, not so much.
I think I didn't read it properly at first. Looking it over it is actually fairly elegant.

How about this?

parseKeyword
    =   (string "apple"      >> return Apple)
    <|> (string "banana"     >> return Banana)
    <|> (string "cantaloupe" >> return Cantaloupe)

For your follow-up, this seems to work just as well as your implementation in the half dozen random tests I did:

parseKeyword :: Parser Keyword
parseKeyword
    = try (string "apple"      >> return Apple)
    <|>   (string "apricot"    >> return Apricot)
    <|>   (string "banana"     >> return Banana)
    <|>   (string "cantaloupe" >> return Cantaloupe)

The technique is simply to have each alternative return the final type itself, instead of collecting the matched string and converting it in a case expression at the end. Adding the returns doesn't change what input the parser accepts.

2 Comments

I accepted shang's answer because he got in first, but I will actually use yours so I don't have to learn another operator just yet.
@kasbah Whichever you want to use is fine. shang's solution is nice for its brevity, but it's up to you whether you feel like using another Functor operator. (<$) is defined simply as fmap . const, which may be easier to visualize with examples: 5 <$ Just 1 == Just 5 and 5 <$ Nothing == Nothing. Since Parser is a Functor, it replaces the value inside the Parser container with whatever is on the left-hand side of the operator, and only if the parse succeeded.
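
The Maybe examples from this comment can be checked directly with nothing but the Prelude. This is a minimal sketch; replaceWith is a hypothetical name used only to spell out the fmap . const definition.

```haskell
-- Hypothetical name for illustration: the same thing as (<$).
replaceWith :: Functor f => b -> f a -> f b
replaceWith = fmap . const

main :: IO ()
main = do
    print ((5 :: Int) <$ Just 'x')                    -- Just 5
    print ((5 :: Int) <$ (Nothing :: Maybe Char))     -- Nothing
    print (replaceWith (5 :: Int) (Just 'x'))         -- Just 5
```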
