2

I know how to sort arrays in javascript by using Array.sort(), but I don't know what is really happening at each iteration in the below code. In this code at certain points some duplicate elements begin to appear in the array. I've searched many websites for an explanation to this mystery, but I don't understand why this is happening. Run this code in chrome Browser. I guess its running on browser's native code. But I'd like to know the logic behind this.

I know the sorting is based on the returning of positive negative and zero values, but why do duplicate elements appear?

var arr = [4, 3, 2, 1];
arr.sort(function(a, b) {
  document.getElementById("output").innerHTML += ("[" + arr + "] " + 'a = ' + a + ' b = ' + b + "<br>");
  return a - b;
});
document.getElementById("output").innerHTML += "[" + arr + "]";
<div id="output"></div>

11
  • Did you see this answer: stackoverflow.com/a/1494742 Commented Dec 28, 2016 at 14:21
  • I know its following some sorting algorithms but why the duplicate is appearing ? And in this question i am asking about array element duplication not about the array sorting Commented Dec 28, 2016 at 14:24
  • 1
    The current array index, that's being compared is stored somewhere temp, and then it is swapped with b or a. This is just my guess. There are more variables than a, b, arr. Try the debugger and see all the variables in scope. Commented Dec 28, 2016 at 14:25
  • 1
    Run the same code in IE it shows entirely different flow and it not shows the array while running through the flow. Commented Dec 28, 2016 at 14:26
  • That's right... Commented Dec 28, 2016 at 14:29

2 Answers 2

3

The short answer is this:

With any in place algorithm that operates on an array, there are no guarantees that intermediate values of the array are valid or complete. The way Chrome has chosen to implement their sort algorithm, there are intermediate steps where the array has values duplicated and is not a complete set of the original array, but this is perfectly acceptable because the sort algorithm is not finished yet.

TLDR version:

If you want to know the exact logic behind this, you can look at the source code for the V8 javascript engine that Chrome uses here https://github.com/v8/v8/blob/master/src/js/array.js

In particular for this case

For short (length <= 22) arrays, insertion sort is used for efficiency.

//This function was copied from the link above
function InsertionSort(a, from, to) {
  for (var i = from + 1; i < to; i++) {
    var element = a[i];
    for (var j = i - 1; j >= from; j--) {
      var tmp = a[j];
      var order = comparefn(tmp, element); //This is where your comparison function is called
      if (order > 0) {
        a[j + 1] = tmp;
      } else {
        break;
      }
    }
    a[j + 1] = element;
  }
};

Stepping through this to the first duplicate...

1st comparison: array before comparison [4,3,2,1]

i=1, j=0, tmp=a=4, element=b=3, order=1, arr[1]=tmp=a=4, arr[0]=element=b=3

2nd comparison: array before comparison [3,4,2,1]

i=2, j=1, tmp=a=4, element=b=2, order=2, arr[2]=tmp=a=4

3rd comparison: array before comparison [3,4,4,1]

i=2, j=0, tmp=a=3, element=b=2(this is still in memory from the outer for loop), order=1, arr[1]=tmp=a=3, arr[0]=element=b=2

4th comparison: array before comparison [2,3,4,1]...

From this you can see why values are briefly duplicated in the array because of how the for loops are set up in the insertion sort. It is set up to write to the array a minimal amount of times and search values are kept in memory, so intermediate values of the array do not need to be valid.

Sign up to request clarification or add additional context in comments.

Comments

2

Because default JavaScript sorting is not O(n).

It is most likely sorting with Heapsort or Mergesort (Firefox's SpiderMonkey engine) which both have a worst-case time complexity of O(n log n) Each browsers' engine uses its own variation of one of these sorting algorithms.

C++ integer sorting uses a Quicksort implementation. Which has a best-case of O(n log n) and a worst-case of O(n^2)

Anything with a time complexity above O(n) will start to show more comparisons than the total number of items.

Note: The default sorting algorithms have been reported to be unstable in the past and it is recommended that you may want to implement your own.

Edit

The values may appear duplicated in the array because you are printing the array DURING the sorting. If you notice the b value is the one missing, so it is being stored in a temporary swap variable while being compared to the other values.

Here is a step-wise debug of a Quicksort implementation. You can see what is going on during the critical steps.

String.repeat = function(str, n) {
  return Array.apply(null, Array(n)).map(String.prototype.valueOf, str).join('');
};
var Debugger = {};
Debugger.debugEnabled = true;
Debugger.indentChar = '&emsp;';
Debugger.println = function(label, indentLevel, rest) {
  if (Debugger.debugEnabled === true) {
    document.body.innerHTML += [
      String.repeat(Debugger.indentChar, indentLevel) +
      '<strong>',  label, ':</strong> ' +
      [].slice.call(arguments, 2).join(' ') +
      '<br />'
    ].join('');
  }
};
var QuickSort = {};
QuickSort.sort = function(items) {
  Debugger.println('Initial Array', 0, items); // Print initial array.
  return QuickSort._quickSort(items, 0, items.length - 1);
};
QuickSort._quickSort = function(items, left, right) {
  var index;
  Debugger.println('Before Sort', 1, items); // Print array before sort.
  if (items.length > 1) {
    index = QuickSort._partition(items, left, right);
    if (left < index - 1) {
      QuickSort._quickSort(items, left, index - 1);
    }
    if (index < right) {
      QuickSort._quickSort(items, index, right);
    }
    Debugger.println('After Sort', 1, items); // Print array after sort.
  }
  return items;
};
QuickSort._swap = function(items, firstIndex, secondIndex) {
  var temp = items[firstIndex];
  items[firstIndex] = items[secondIndex];
  Debugger.println('During Swap', 3, items); // Print array during the swap.
  items[secondIndex] = temp;
};
QuickSort._partition = function(items, left, right) {
  var pivot = items[Math.floor((right + left) / 2)], i = left, j = right;
  while (i <= j) {
    while (items[i] < pivot) {
      i++;
    }
    while (items[j] > pivot) {
      j--;
    }
    if (i <= j) {
      Debugger.println('Comparing', 2, items[i], ' (' + i + ') ', items[j], ' (' + j + ')');
      QuickSort._swap(items, i, j); // Print swapped array values.
      i++;
      j--;
    }
  }
  return i;
};

var array = [4, 3, 2, 1];
var result = QuickSort.sort(array);

Debugger.println('Result Array', 0, result);  // Print result array.

4 Comments

Most standard implementations of sort use insertion sort once the array size is around less than 14 since most O(n log n) algorithms use recursive calls. Also, I think you missed the question's point.
Not really... "Anything with a time complexity above O(n) will start to show more comparisons than the total number of items." — meaning that you will see multiple comparisons printing out.
Yeah, definitely, but OP was wondering why numbers show up multiple times when you run the code snippet. See the third array [3,4,4,1] even though the original array was [4,3,2,1] (running chrome)
@JoshDawson I added an example using Quicksort to show why the array may have duplicates.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.