In this answer on CodeReview, the question asker and the answerer both seem to show disdain for the (++) operator. Is this due to it's speed (causing the algorithm to explicitly run in O(n^2) where n is the length of the list iirc)? Is this a pre-optimization if not otherwise tested, as Haskell is known for being difficult to reason about time complexity? Should others avoid the (++) operator in their programs?
-
Haskell lists are linked lists, and appending lists is an O(n) operation in strict languages. In Haskell, however, it’s not quite as straightforward: laziness allows that work to be put off until the list is used, effectively distributing the cost over the length of the list. As you mention, this laziness can make time and space complexities difficult to reason about in Haskell, but I think the same advice for other languages applies to Haskell, too: write for clarity first, and optimize for speed if and only if you both need it to go faster and you’ve profiled to find a bottleneck.Alexis King– Alexis King2016-07-28 05:22:29 +00:00Commented Jul 28, 2016 at 5:22
-
it's both - it's harder to reason about your code, it usually slows down your code quite a bit and you can very often work around using it - but it all depends on the circumstances and IMO your questions is to unspecific to be a good fit for SORandom Dev– Random Dev2016-07-28 05:22:32 +00:00Commented Jul 28, 2016 at 5:22
-
@Carsten: I'm a little confused about your remark regarding "it usually slows down your code". While this is certainly true with the canonical tail-recursive factorial example, is this really quite common amongst everyday programs? I thought the benefit to something like Haskell over one of the more popular outsider languages like Ruby and Python was the optimizations afforded by the rich typing system. If you really believe that my questions are not specific enough, what about in the example I provided, save only the last question (which was my intention, and I think ThreeFx figured it out.Eliza Brandt– Eliza Brandt2016-07-28 06:09:02 +00:00Commented Jul 28, 2016 at 6:09
-
you should not worry if I think the question is no fit for SO - it's not about the question but about SORandom Dev– Random Dev2016-07-28 06:37:10 +00:00Commented Jul 28, 2016 at 6:37
-
I'm not worrying, I just think it is a fit for SO. ThreeFx was able to help me significantly with that, and I hope many others as well.Eliza Brandt– Eliza Brandt2016-07-28 06:43:04 +00:00Commented Jul 28, 2016 at 6:43
1 Answer
It depends.
Consider the expression
foldl (++) [] list
This expression concatenates a list of lists into a single list, but has aforementioned quadratic complexity. This happens because the implementation of (++) traverses the entire left list and prepends each element to the right list (while keeping the correct order of course).
Using a right fold, we get linear complexity:
foldr (++) [] list
This is due to the (++) operator's implementation, which traverses only the left argument and prepends it to the right.
[1,2] ++ [3,4] ++ [5,6]
is equal to
-- Example as created by foldr
[1,2] ++ ([3,4] ++ [5,6])
== [1,2] ++ [3,4,5,6]
== [1,2,3,4,5,6] -- all good, no element traversed more than once
which only traverses each list element once.
Now switching the parentheses around to the first two lists is more expensive, since now some elements are traversed multiple times, which is inefficient.
-- Example as created by foldl
([1,2] ++ [3,4]) ++ [5,6]
== [1,2,3,4] ++ [5,6]
== [1,2,3,4,5,6] -- the sublist [1,2] was traversed twice due to the ordering of the appends
All in all, watch out for such cases and you should be fine.
7 Comments
(^) (or (-)): These operators produce different results when used in left and right folds, respectively. It depends on what you want it to do. foldl' is an entirely different matter, it is a strict version of the standard foldl and should almost always be preferred over foldl, except when you explicitly want to build up thunks.foldr is better has nothing to do with the associativity of (++). It's because of how (++) has to work when concatenating immutable linked lists. It must create new (:) constructors for each element in its left argument, but it doesn't do anything at all with the right argument. Therefore, if you stack up a left-biased tree of calls to (++) it repeatedly copies the same elements. If you stack up a right-biased tree, it only copies the elements it needs to once.