3

I'm trying to write OpenMP solution for Insertion sort but I'm having problems to make it run in parallel and give correct results :). Is there any way to make Insertion sort it run in parallel.

Here is my code:

void insertionsort(int *A, int num)
{
    // clock_t start, stop;
    // 
    // start=clock();
    int k;
    #pragma omp parallel for shared(A) private(k)
    for(int n = 1; n < num; n++)
    {
        int key = A[n];
        k = n;
        #pragma omp critical
        for(;k>0 && A[k-1]> key;k--)
        {
            A[k] = A[k-1];   
        }
        A[k] = key;  
    }
// stop=clock();
// cas = (double)(stop-start)/CLOCKS_PER_SEC;
}
1
  • Using clock() to measure the run time of a threaded program is wrong. It gives you the accumulated CPU time, which means the CPU time of all threads. Use omp_get_wtime() instead. On the topic, this critical region would basically serialise your "parallel" loop as the amount of work outside it is practically none. Commented Dec 16, 2012 at 22:25

1 Answer 1

7

You should not parallelize the Insertion Sort algorithm in this manner. From the inner loop condition A[k-1]> key; one can see that this algorithm assumes that for a given key in the position k of the array, if the actual key is bigger than the keys stored on the previous position of the array the swapping of elements stops. Hence, this algorithm assumes that the keys on the positions before k are already sorted.

When you introduce parallelism, for instance, with two threads, threads 0 and 1 start from the beginning and middle of the array, respectively. However, the first half of the array is not sorted, according to the (wrong) assumption made by the algorithm, which consequently, leads to issues.

Let me illustrate the problem, let us sort the array = [-1,2,-3,4,-5,6,-7,8] with 2 threads: Let us fix a given execution ordered (in reality it is non-deterministic)

    1. Thread 0 takes k = 1 and key = 2; array status [-1,2,-3,4,-5,6,-7,8]
    1. Thread 1 takes k = 5 and key = 6; array status [-1,2,-3,4,-5,6,-7,8]
    1. Thread 0 takes k = 2 and key = -3; array status [-3,-1,2,4,-5,6,-7,8]
    1. Thread 1 takes k = 6 and key = -7; array status [-7,-3,-1,2,4,-5,6,8]
    1. Thread 0 takes k = 3 and key = 2; array status [-7,-3,-1,2,4,-5,6,8]
    1. Thread 1 takes k = 7 and key = 8; array status [-7,-3,-1,2,4,-5,6,8]
    1. Thread 0 takes k = 4 and key = 4; array status [-7,-3,-1,2,4,-5,6,8]

Final result : [-7,-3,-1,2,4,-5,6,8]

On line 4 thread 1 takes the key -7 from position 6 and puts it in the end of the array sifting all its elements from positions 1 to 6 (included) one position to the right, so now -5 is on the old position of -7. Since, the old position of -7 (6) will never be compared again -5 will stay there untouchable. Therefore, making the array unsorted.

An easy, albeit poor solution, would be to add the OpenMP ordered clause to the parallel for construct. However, using this constructor would basically sequentialize your code.

Another possible solution, although I am not 100% sure it fits your needs, would be to make your algorithm parallel by Regular Sampling. You can see here an example of this latter technique apply on quicksort.

The struct of your algorithm is not the most suitable to be parallelize directly and achieve a good speedups. Since each iteration of the inner loop is interdependent, this will required the use of methods to unsure mutual exclusion, thus resulting in synchronization overhead. You have far better sorting algorithm that you can directly parallelize, typically the ones that use a divide-and-conquer strategy, for instance Radix Sort or Quick Sort, among others.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.