2

I am currently doing reasonably well in functional programming using F#. I tend, however, to do a lot of programming using recursion, when it seems that there are better idioms in the F#/functional programming community. So in the spirit of learning, is there a better/more idiomatic way of writing the function below without recursion?

let rec convert line =
    if line.[0..1] = "  " then
        match convert line.[2..] with
        | (i, subline) -> (i+1, subline)
    else
        (0, line)

with results such as:

> convert "asdf";;
val it : int * string = (0, "asdf")
> convert "  asdf";;
val it : int * string = (1, "asdf")
> convert "      asdf";;
val it : int * string = (3, "asdf")
5
  • You should understand what tail recursion is. Commented Sep 15, 2012 at 7:45
  • Thanks @BasileStarynkevitch. I believe I do understand it. In this case I would use an accumulator for the variable i and convert the call to a tail call. I am, however, looking to see if there is a different way. Probably using some sort of fold or unfold. Commented Sep 15, 2012 at 7:53
  • 1
    Lazy answer - fun x -> (x.Length - x.TrimStart().Length),x Commented Sep 15, 2012 at 8:00
  • Agree (after a division by 2), @JohnPalmer. But you know what I mean:) Commented Sep 15, 2012 at 8:16
  • @JohnPalmer It needs a bit more tuning :-) Muhammad's version counts the number of two-spaces and also returns the string after removing all two-spaces from the beginning (and not just x). But yeah, TrimStart is a .NET way to go - it is just a bit more complex :-). Commented Sep 15, 2012 at 9:18

4 Answers 4

6

Recursion is the basic mechanism for writing loops in functional languages, so if you need to iterate over characters (as you do in your sample), then recursion is what you need.

If you want to improve your code, then you should probably avoid using line.[2..] because that is going to be inefficient (strings are not designed for this kind of processing). It is better to convert the string to a list and then process it:

let convert (line:string) = 
  let rec loop acc line =
    match line with
    | ' '::' '::rest -> loop (acc + 1) rest
    | _ -> (acc, line)
  loop 0 (List.ofSeq line)

You can use various functions from the standard library to implement this in a more shorter way, but they are usually recursive too (you just do not see the recursion!), so I think using functions like Seq.unfold and Seq.fold is still recursive (and it looks way more complex than your code).

A more concise approach using standard libraries is to use the TrimLeft method (see comments), or using standard F# library functions, do something like this:

let convert (line:string) =
  // Count the number of spaces at the beginning
  let spaces = line |> Seq.takeWhile (fun c -> c = ' ') |> Seq.length
  // Divide by two - we want to count & skip two-spaces only
  let count = spaces / 2
  // Get substring starting after all removed two-spaces
  count, line.[(count * 2) ..]

EDIT Regarding the performance of string vs. list processing, the problem is that slicing allocates a new string (because that is how strings are represented on the .NET platform), while slicing a list just changes a reference. Here is a simple test:

let rec countList n s = 
  match s with 
  | x::xs -> countList (n + 1) xs
  | _ -> n

let rec countString n (s:string) =
  if s.Length = 0 then n
  else countString (n + 1) (s.[1 ..])


let l = [ for i in 1 .. 10000 -> 'x' ]
let s = new System.String('x', 10000)

#time 
for i in 0 .. 100 do countList 0 l |> ignore    // 0.002 sec (on my machine)
for i in 0 .. 100 do countString 0 s |> ignore  // 5.720 sec (on my machine)
Sign up to request clarification or add additional context in comments.

2 Comments

I probably should take a look at how string slicing is done. I am surprised it is not efficient. My intuition is that slicing of immutable strings should be a very cheap O(1) operation. Any reason it is not?
@MuhammadAlkarouri See the edit for more information on performance.
2

Because you traverse the string in a non-uniform way, a recursive solution is much more suitable in this example. I would rewrite your tail-recursive solution for readability as follows:

let convert (line: string) =
    let rec loop i line =
        match line.[0..1] with
        | "  " -> loop (i+1) line.[2..]
        | _ -> i, line
    loop 0 line

Since you asked, here is a (bizarre) non-recursive solution :).

let convert (line: string) = 
    (0, line) |> Seq.unfold (fun (i, line) -> 
                                let subline = line.[2..]
                                match line.[0..1] with
                                | "  " -> Some((i+1, subline), (i+1, subline))
                                | _ -> None)
              |> Seq.fold (fun _ x -> x) (0, line)

6 Comments

Thanks. The second solution is essentially my answer, in a more refined form. For the first, this is what I am looking for. A question though: you say I traverse the string in a uniform way, how? What type of problems are better written as you did with an unfold followed by a fold?
The version using unfold and fold looks a bit scary :-)
The fold bit could be written shorter by using Seq.reduce or even in new versions Seq.last which is what the last line does anyway.
@MuhammadAlkarouri: No. Seq.fold doesn't fail on empty sequences while Seq.reduce and Seq.last do.
@TomasPetricek: So it serves the purpose well :). I'm trying to convince that recursive solutions are better in this case.
|
1

Using tail recursion, it can be written as

let rec convert_ acc line =
    if line.[0..1] <> "  " then
        (acc, line)
    else
        convert_ (acc + 1) line.[2..]
let convert = convert_ 0

still looking for a non-recursive answer, though.

Comments

0

Here's a faster way to write your function -- it checks the characters explicitly instead of using string slicing (which, as Tomas said, is slow); it's also tail-recursive. Finally, it uses a StringBuilder to create the "filtered" string, which will provide better performance once your input string reaches a decent length (though it'd be a bit slower for very small strings due to the overhead of creating the StringBuilder).

let convert' str =
    let strLen = String.length str
    let sb = System.Text.StringBuilder strLen

    let rec convertRec (count, idx) =
        match strLen - idx with
        | 0 ->
            count, sb.ToString ()

        | 1 ->
            // Append the last character in the string to the StringBuilder.
            sb.Append str.[idx] |> ignore
            convertRec (count, idx + 1)

        | _ ->
            if str.[idx] = ' ' && str.[idx + 1] = ' ' then
                convertRec (count + 1, idx + 2)
            else
                sb.Append str.[idx] |> ignore
                convertRec (count, idx + 1)

    // Call the internal, recursive implementation.
    convertRec (0, 0)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.