My goal is to count how many times a substring occurs within a string. The substring I'm looking for has the form "[n]", where n can be any value.

My attempt splits the string with the words function, then builds a new list containing only the strings whose head is '[' and whose last character is ']'.

The problem I ran into is that, for some inputs, words produces a string like "[2]," (with a trailing comma). I still want this to count as an occurrence of "[n]".

For example, I would want this string,

asdf[1]jkl[2]asdf[1]jkl

to return 3.

Here's the code I have:

-- String that will be tested on references function
txt :: String
txt = "[1] and [2] both feature characters who will do whatever it takes to " ++
  "get to their goal, and in the end the thing they want the most ends " ++
  "up destroying them.  In case of [2], this is a whale..."

-- Function that will take a list of Strings and return a list that contains
-- any String of the type [n], where n is a variable
ref :: [String] -> [String]
ref [] = []
ref xs = [x | x <- xs, head x == '[', last x == ']']

-- Function takes a text with references in the format [n] and returns
-- the total number of references.
-- Example :  ghci> references txt -- -> 3
references :: String -> Integer
references txt = toInteger (length (ref (words txt)))

If anyone can enlighten me on how to search for a substring within a string or how to parse a string given a substring, that would be greatly appreciated.

3 Answers

I would just use a regular expression, and write it like this:

import Text.Regex.Posix

txt :: String
txt = "[1] and [2] both feature characters who will do whatever it takes to " ++
  "get to their goal, and in the end the thing they want the most ends " ++
  "up destroying them.  In case of [2], this is a whale..."


-- references counts the number of references in the input string
references :: String -> Int
references str = str =~ "\\[[0-9]*\\]"

main = putStrLn $ show $ references txt -- outputs 3

2 Comments

Thanks jcarpenter! Would you mind explaining what the =~ operator does? Is it part of the imported library? I was hoping more to figure out how to parse each occurrence of [n], because I would eventually like to replace each [n] with a String taken from a list, indexed by whatever n is.
I don't know how =~ works internally. It matches a regex against a string, and it can return a variety of different types. Google or other people can elaborate on it better than I can.
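
To illustrate the "variety of different types" mentioned above, here is a minimal sketch, assuming the regex-posix package is installed. The helper names (refPattern, hasRef, countRefs, firstRef, allRefs) are made up for this example; only =~ and getAllTextMatches come from the library. The operator is polymorphic in its result, so the type you ask for selects what it returns.

```haskell
import Text.Regex.Posix

-- Same pattern as in the answer above: '[', any run of digits, ']'.
refPattern :: String
refPattern = "\\[[0-9]*\\]"

-- Asking for a Bool tells you whether there is any match at all.
hasRef :: String -> Bool
hasRef s = s =~ refPattern

-- Asking for an Int gives the number of matches.
countRefs :: String -> Int
countRefs s = s =~ refPattern

-- Asking for a String gives the first match ("" if there is none).
firstRef :: String -> String
firstRef s = s =~ refPattern

-- Wrapping the result in getAllTextMatches gives every matched substring.
allRefs :: String -> [String]
allRefs s = getAllTextMatches (s =~ refPattern)
```

In GHCi, allRefs applied to the txt from the answer should list each reference in order, which is a starting point for the replacement described in the comment.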

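Regex is huge overkill for such a simple problem.

-- Count the references by collecting their contents and taking the length.
references :: String -> Int
references = length . consume

-- Skip characters until a '[', then collect the text up to the next ']'.
consume :: String -> [String]
consume []       = []
consume ('[':xs) = let (v,rest) = consume' xs in v:consume rest
consume (_  :xs) = consume xs

-- Return the text before the next ']' and whatever follows it.
consume' :: String -> (String, String)
consume' []       = ([], [])
consume' (']':xs) = ([], xs)
consume' (x  :xs) = let (v,rest) = consume' xs in (x:v, rest)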

consume waits for a [ , then calls consume', which gathers everything until a ].
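
The same character-by-character walk can probably be adapted to the replacement mentioned in the comments on the question's first answer (swapping each [n] for the n-th entry of a list). The following is only a sketch under assumptions: expand and lookupRef are hypothetical names, indices are taken to be 1-based, and anything that is not a valid, in-range numeric reference is copied through unchanged.

```haskell
import Data.Char (isDigit)

-- Hypothetical helper: replace each "[n]" in the input with the n-th entry
-- (1-based, an assumption) of refs. Unclosed brackets, non-numeric contents,
-- and out-of-range indices are left exactly as they appear in the input.
expand :: [String] -> String -> String
expand _ [] = []
expand refs ('[':xs) =
  case break (== ']') xs of
    (digits, ']':rest)
      | not (null digits)
      , all isDigit digits
      , Just r <- lookupRef (read digits) -> r ++ expand refs rest
    _ -> '[' : expand refs xs
  where
    -- Safe indexing: Nothing when n falls outside the list.
    lookupRef :: Int -> Maybe String
    lookupRef n
      | n >= 1 && n <= length refs = Just (refs !! (n - 1))
      | otherwise                  = Nothing
expand refs (c:xs) = c : expand refs xs
```

For instance, expand ["Moby Dick", "Ahab"] "[1] and [2], then [2]." evaluates to "Moby Dick and Ahab, then Ahab.", while a string such as "x[9]y[z]" comes back unchanged because neither bracket names a valid entry.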

1 Comment

I prefer this over the other answer because A) it's a succinct solution in Haskell, rather than in Regex; and B) this probably makes it easier to understand and modify for OP's use case.
Here's a solution with sepCap.

import Replace.Megaparsec
import Text.Megaparsec
import Text.Megaparsec.Char
import Data.Either
import Data.Maybe
import Data.Void (Void)

txt = "[1] and [2] both feature characters who will do whatever it takes to " ++
  "get to their goal, and in the end the thing they want the most ends " ++
  "up destroying them.  In case of [2], this is a whale..."

pattern = single '[' *> anySingle <* single ']' :: Parsec Void String Char
length $ rights $ fromJust $ parseMaybe (sepCap pattern) txt
3
