1

I have n (n between 1 and 100) sorted number arrays, each with m elements (m around 1000 in my case). I want to merge them into a single sorted array.

I can think of two possibilities for doing this:

1.Use a two arrays merging algo (like merge() function below from http://www.nczonline.net/blog/2012/10/02/computer-science-and-javascript-merge-sort/) and applying it iteratively (1st and 2nd, then merge of 1st-2nd and 3rd, etc)

  function merge(left, right) {
      var result  = [],
        il      = 0,
        ir      = 0;
      while (il < left.length && ir < right.length){
        if (left[il] < right[ir]){
            result.push(left[il++]);
        } else {
            result.push(right[ir++]);
        }
    }
    return result.concat(left.slice(il)).concat(right.slice(ir));
}
  1. Generalize merge() function to n arrays simultaneously. At each iteration, I would pick the min value of the n first values not yet processed and append it to the result.

Are these two algo equivalent in terms of complexity ? I have the feeling that both algo are in o(m*n). Am I right ?

Are there any performance consideration to take one algo rather than the other ? I have the feeling that 1 is simpler than 2.

2 Answers 2

3

Merge n arrays using priority queue (based on binary heap, for example).
Overall element count is m*n, so algorithm complexity is O(m * n * Log(n)). algorithm sketch:

Add numbers 1..n to priority queue, using 1st element of every 
array as sorting key 
(you may also use pairs (first element/array number).
At every step - 
  J = pop_minimum
  add current head of Jth array to result
  move head of Jth array to the right
  if Jth array is not exhausted, insert J in queue (with new sorting key)

1st algoritm complexity is

2*m + 3*m+ 4*m+...+n*m = m * (n*(n-1)/2-1) =  O(n^2 * m)
Sign up to request clarification or add additional context in comments.

5 Comments

not sure to understand your answer: the n arrays are initially sorted. Why using a binary heap ?
Yes, I took into account. Priority queue is needed to select current minimum from n array tails. I'll write short description
I'll do some googling because I don't see how to apply binary heap.
Ok, I've got it. The Log(n) multiplicative term in the complexity is the cost for maintaining the queue sorted using binary heap, right ?
A heap is indeed what you want. You're not using the heap to sort the arrays; you're using it to keep track of which array to pick the next element from.
1

That's an old question, but for the sake of posterity:

Both algos are indeed O(n*m). In algo 1, you have to remerge for each m array. In algo 2, you do just one big merge, but picking out the minimum from m arrays is still linear.

What I did instead was implement a modified version of merge sort to get O(mlogn).

The code is there on GitHub https://github.com/jairemix/merge-sorted if anyone needs it.

 

Here's how it works

The idea is to modify algo 1 and merge each array pairwise instead of linearly.

So in the first iteration you would merge array1 with array2, array3 with array4, etc.

Then in the second iteration, you would merge array1+array2 with array3+array4, array5+array6 with array7+array8, etc.

For example:

// starting with:
[1, 8], [4, 14], [2, 5], [3, 7], [0, 6], [10, 12], [9, 15], [11, 13]

  // after iteration 1:
  [1, 4, 8, 14],  [2, 3, 5, 7],   [0, 6, 10, 12],   [9, 11, 13, 15]

    // after iteration 2
    [1, 2, 3, 4, 5, 7, 8, 14],      [0, 6, 9, 10, 11, 12, 13, 15]

        // after iteration 3
        [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

In JS:

function mergeSortedArrays(arrays) {

  // while there are still unmerged arrays
  while (arrays.length > 1) {
    const result = [];

    // merge arrays in pairs
    for (let i = 0; i < arrays.length; i += 2) {
      const a1 = arrays[i];
      const a2 = arrays[i + 1];

      // a2 can be undefined if arrays.length is odd, so let's do a check
      const mergedPair = a2 ? merge2SortedArrays(a1, a2) : a1;
      result.push(mergedPair);
    }

    arrays = result;
  }

  // handle the case where no arrays is input
  return arrays.length === 1 ? arrays[0] : [];

}

Notice the similarity to merge sort. In fact in merge sort, the only difference is that n = m, so you're starting further back with m presorted arrays of 1 item each. Hence the O(mlogm) complexity of merge sort.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.