2

I fear I've missed something "standard" but I did try, honest!

I have been trying to understand how to implement efficient, tail-recursive, operations on linked lists. (I'm writing in Scala, but I doubt that's actually relevant).

Note: I am investigating this from a algorithms theory perspective, I'm not interested in "use the prebuilt library, it's already worked this out" :)

So, if I perform a trivial filter implementation to act on a list:

  def filter[T](l: List[T])(f: T => Boolean): List[T] = l match {
      case h :: t => if (f(h)) h :: filter(t)(f) else filter(t)(f)
      case _ => List()
  }

this isn't tail recursive, but it's tolerably efficient (in so far as it only prepends new items onto the result list it constructs)

However, the simple tail-recursive variant I came up with:

  def filter[T](l: List[T])(f: T => Boolean): List[T] = {
    @tailrec
    def filterAcc[T](l: List[T], f: T => Boolean, acc: List[T]): List[T] = l match {
      case List() => acc
      case h :: t => if (f(h)) filterAcc(t, f, h :: acc) else filterAcc(t, f, acc)
    }
    filterAcc(l, f, List())
  }

reverses the order of the items. (Which is not a surprise, of course!)

I could fix the order, of course, by appending the filtered item to the accumulator, but that would make this an O(n^2) implementation I believe (as each append would force building an entirely new list which is an O(n) operation on a Scala immutable list, times n repetitions for the n elements in the list)

I could also fix this by calling reverse on the generated list. I appreciate that this would be a single O(n) operation, and so the overall time complexity is still O(n), but it seems ugly.

So, my question is simply this; is there a solution that is tail-recursive, and accumulates in the correct order from the outset, that's O(n) and likely involves less work than the "catch it backwards and reverse it" option? Did I miss something (which is normal for me, I fear :( )

3
  • 2
    Reversing afterwards is fairly standard. Not sure what's "ugly" about it that you see. Alternatively, use something like Queue or Vector as they offer (mostly) constant-time append. Commented Feb 6, 2018 at 11:31
  • Well, if it weren't necessary, it would be ugly :) But if it's necessary then so be it! Commented Feb 6, 2018 at 12:12
  • 1
    You can see it like this. The filtering itself requires a single traversal (if you had a single-linked-list with mutable links, you could append to the end, and this single traversal would be enough). But then, you want to build an immutable data structure as the result. The second traversal (for reversing) can be seen as the price to be paid for having a nice, immutable, thread-safe, persistent data structure as the result. So, it's usually worth it. Commented Feb 6, 2018 at 12:17

3 Answers 3

2

The reason why you can't avoid reverse is because the standard library List is a Linked List with pointer to the head : it is perfectly possible to implement your own list with pointer to the tail and avoid calling reverse.

However, because this will not bring any improvement from the algorithmic standpoint it doesn't make much sense nor to write this list yourself, neither to have it included in the standard library

Sign up to request clarification or add additional context in comments.

Comments

1

No, you are not missing anything. It's completely normal to accumulate some results and then reverse in the end. If you don't like it, then you can try to express your calculation by some combination of standard operations like foldLeft, map, flatMap, filter etc.

That being said... If you forget about private modifiers and immutability, then you actually can write a tail-recursive filter, but it's really not pretty:

import scala.annotation.tailrec

def filter[T](l: List[T])(f: T => Boolean): List[T] = {

  val tailField = classOf[::[_]].getDeclaredField("tl")
  tailField.setAccessible(true)

  /* Appends a value in constant time by 
   * force-overriding the tail element of the first cons 
   * of the list `as`. If `as` is `Nil`, returns a new cons.
   *
   * @return the last cons of the new list
   */
  def forceAppend[A](as: List[A], lastCons: List[A], a: A): (List[A], List[A]) = as match {
    case Nil => {
      val newCons = a :: Nil
      (newCons, newCons) // not the same as (a :: Nil, a :: Nil) in this case!!!
    }
    case _ => {
      val newLast = a :: Nil
      tailField.set(lastCons, newLast)
      (as, newLast)
    }
  }

  @tailrec
  def filterAcc[T](l: List[T], f: T => Boolean, acc: List[T], lastCons: List[T]): List[T] = {
    l match {
      case List() => acc
      case h :: t => if (f(h)) {
        val (nextAcc, nextLastCons) = forceAppend(acc, lastCons, h) 
        filterAcc(t, f, nextAcc, nextLastCons)
      } else {
        filterAcc(t, f, acc, lastCons)
      }
    }
  }
  filterAcc(l, f, Nil, Nil)
}

val list = List("hello", "world", "blah", "foo", "bar", "baz")
val filtered = filter(list)(_.contains('o'))

println(filtered)

What happens here is: we simply pretend that we are writing code in C, and that we want to work directly with references from which the data structure is built. This allows us to keep a reference to the last cons of the list, and then to override the pointer to the next cons, instead of prepending to the head. This temporarily breaks the immutability, but it's even more-or-less ok in this case, because the accumulator does not leak to the outside while we are building it.

I highly doubt that this is any faster than the straight-forward implementation. Quite the contrary: it might actually be slower, because the code is more complicated, and harder for the compiler to optimize.

1 Comment

This makes good sense, I vaguely recall reading that the Scala List does some kind of copy on write trick to allow it to be mutable, and tail appendable in some way / situation, that improved something. I suspect this "dirty linen isn't too bad if you keep it hidden" kind of thing is perhaps on similar philosophical lines. But if I didn't miss the obvious/standard, I'm happy.
0

Perhaps use the :+ syntax to append head instead.

 def filter[T](l: List[T])(f: T => Boolean): List[T] = {
    @tailrec
    def filterAcc(l: List[T], f: T => Boolean, acc: List[T]): List[T] = l match {
      case Nil => acc
      case h :: t => if (f(h)) filterAcc(t, f, acc :+ h) else filterAcc(t, f, acc)
}

filterAcc(l, f, List())

}

4 Comments

This is inefficient - :+ is O(n) and if you're doing it n times the whole algorithm is O(n^2)
That is true. Given that Scala uses a singely linked list as it immutable list implementation there is no way of achieving a O(n) algorithm. Nice question.. sorry I didnt read it fully.
Perhaps you could create your own implementation of List as a doubely linked list or as a circular linked list. That would make an O(n) algorith possible.
Double linked list are impossible in immutable world

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.