I want to implement a Haskell function wordToken that splits a string of words into a list of strings, keeping full stops and commas as separate elements.

For example, "the man saw." should result in ["the","man","saw","."].

My approach: if the Char is a comma or full stop, add it as is. If it is a Char followed by another Char, add them both. If it is a Char followed by a space, add it and continue with the rest of the list. But I'm not sure how to tell it to separate the words themselves, or how to express that chars added together form a new string.

 wordToken []= " "

 wordToken (x:y:z) | x==',' || x=='.' = " "(++)x:wordToken( y:z)
              | x/='\n' && y/='\n'= " "(++)x(++)y(++)wordToken z
              | x/='\n' && y=='\n'= " "(++)x:wordToken z
              |     otherwise = wordToken z 

I also tried to use the words function and add the punctuation separately, but it gave me a type mismatch:

 wordToken (x:xs) | x=='.' || x==',' = 'x':wordToken xs
                  | otherwise = words (x:xs)

  • Have you looked up regex or parsing in Haskell? Commented Apr 17, 2020 at 21:20
  • No, I haven't looked into it, but I will check it out. Commented Apr 17, 2020 at 21:21
  • wordToken []= " " wordToken (x:y:z) | x==',' || x=='.' = " "(++)x:wordToken( y:z) | x/='\n' && y/='\n'= " "(++)x(++)y(++)wordToken z | x/='\n' && y=='\n'= " "(++)x:wordToken z | otherwise = wordToken z Commented Apr 17, 2020 at 21:21
  • Please place your attempt in the question section. Note what it does instead of what you expect it to do. Commented Apr 17, 2020 at 21:22

1 Answer

To improve upon your idea, I suggest using a helper function with an accumulator that stores the characters of the current word until the next separator. As soon as you reach either the end of the string or a separator, you add the accumulated word to the list (if it is not empty) and reset the accumulator to "".

wordToken :: String -> [String]
wordToken "" = [] -- empty list
wordToken str = helper str "" -- start helper with empty current word
    where helper :: String -> String -> [String]
          -- when the entire string is consumed, emit the pending word (if any)
          helper "" current = emit current []
          -- otherwise
          helper (x:xs) current
              | x == ',' || x == '.' = emit current ([x] : helper xs "") -- add comma or full stop as its own token
              | x == ' '             = emit current (helper xs "") -- skip whitespace
              | otherwise            = helper xs (current ++ [x]) -- no separator: keep building the current word
          -- prepend the current word unless it is empty, so consecutive
          -- separators (e.g. ", ") do not produce "" entries
          emit :: String -> [String] -> [String]
          emit ""   rest = rest
          emit word rest = word : rest

This results in the expected output:

wordToken "the man saw."
> ["the", "man", "saw", "."]
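The idea from the question of reusing the standard `words` function can also be made to work. The following is only a minimal sketch, not part of the approach above: it pads every comma and full stop with spaces and then lets `words` do the splitting. The names `wordTokenViaWords` and `pad` are made up for illustration. Note that `words` splits on any whitespace (tabs and newlines too), whereas the helper above only treats `' '` as a separator.

```haskell
-- Sketch: tokenize by padding punctuation with spaces, then calling `words`.
-- `wordTokenViaWords` and `pad` are illustrative names (assumptions).
wordTokenViaWords :: String -> [String]
wordTokenViaWords = words . concatMap pad
  where
    -- surround commas and full stops with spaces so `words` splits them off
    pad :: Char -> String
    pad c
      | c == ',' || c == '.' = [' ', c, ' ']
      | otherwise            = [c]
```

Because `words` discards runs of whitespace, inputs such as "a, b" yield no empty strings with this version.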

6 Comments

Is it possible for me to delete my question?
This is generally not welcomed, as the people who answered your question usually put effort into solving your problem.
Yes, but this is my first time here, so I didn't know about this rule. I'm sorry for the inconvenience, but I just realized that posting my code could be considered plagiarism for my project.
No worries. You did not post your final code and at the same time, it is okay to look something up.
Is it possible for you to delete your answer, since that's the only way I can delete the question? @Erich