
I saw a presentation the other day in which the speaker had used the technique outlined in McIlroy's paper A Killer Adversary for Quicksort to generate an input to Arrays.sort for primitive types that triggers O(n²) behavior. The sequence causes the pivot selection to reduce the array size by only a constant at each level of recursion, which makes the Java Arrays.sort function overflow the stack.
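To make the failure mode concrete, here is a small sketch (not McIlroy's actual adversary, and not the JDK's pivot strategy): a quicksort that naively picks the first element as pivot degrades to quadratic behavior on already-sorted input, because every partition peels off only one element. The class name and comparison counter are illustrative.

```java
// Illustration only: first-element pivot quicksort hits its worst case
// on sorted input, doing ~n^2/2 comparisons and recursing n levels deep.
public class QuadraticQuicksort {
    static long comparisons = 0;

    static void quicksort(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        int pivot = a[lo];                 // naive pivot: first element
        int i = lo;
        for (int j = lo + 1; j <= hi; j++) {
            comparisons++;
            if (a[j] < pivot) {
                i++;
                int t = a[i]; a[i] = a[j]; a[j] = t;
            }
        }
        int t = a[lo]; a[lo] = a[i]; a[i] = t;
        quicksort(a, lo, i - 1);           // on sorted input this side is empty
        quicksort(a, i + 1, hi);           // ...and this side shrinks by only 1
    }

    public static void main(String[] args) {
        int n = 2000;
        int[] sorted = new int[n];
        for (int k = 0; k < n; k++) sorted[k] = k;
        quicksort(sorted, 0, n - 1);
        // n(n-1)/2 = 1,999,000 comparisons, versus ~n log n for a good pivot
        System.out.println(comparisons);
    }
}
```

McIlroy's adversary is more sophisticated: it works against median-of-three and similar pivot schemes by deciding element values lazily as the sort compares them, but the end result is the same linear recursion depth shown above.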

According to the source files from the JDK, the Arrays.sort1 quicksort implementation has no guard against stack overflow. It is always possible to make quicksort stack-safe by not firing off two recursive calls: instead, recurse only on the smaller subarray and use a while loop to reuse the current stack frame for the larger one. This causes minimal performance degradation and makes it impossible to cause a stack overflow for any reasonably sized input, since the stack depth never exceeds O(log n) frames on an input of size n. Alternatively, the authors could have used the introsort algorithm, which modifies quicksort to switch to a worst-case O(n log n) sorting algorithm when the quicksort recursion depth exceeds some limit.

Is there any reason why the authors of Arrays.sort didn't opt to do this? It seems like a serious problem that a built-in sorting algorithm can cause a stack overflow, as it makes it possible to launch a DoS attack against any system that sorts untrusted input by triggering repeated stack overflows.

  • To know for sure we would need to ask Vladimir Yaroslavskiy, Jon Bentley, or Josh Bloch. However, in Java 1.7 the sort1 method was removed and replaced with DualPivotQuicksort, but I am not good enough at this stuff to know whether that is better than the old approach. Commented May 10, 2013 at 23:35

1 Answer

Why? Because solving the problem would be overkill.

The algorithm used will behave well in all but exceptionally unusual circumstances, and if those circumstances are more likely than usual to occur, the situation will be guarded against externally. That is why the API documentation specifies the algorithm used behind the scenes: so you can defend against it.

The chances of the specific order that breaks the algorithm being presented are vanishingly small.

I expect that if you looked carefully enough, you would find datasets that break almost all of the standard JVM structures. What is the cost of protecting against them, and is that cost worth the effort and the inevitable degradation of the algorithm due to the defensive measures?

