1

I'm writing a quicksort algorithm to sort an array of strings.

The problem is that my array with the data seem to be overwritten with something right after i allocate the right and left quicksort arrays, because i print the array and its all there, but after i use malloc to allocate the others arrays, i print it again and i'm missing some elements.

Here's the output:

Pivot: 2
Emma, Olivia, Victoria, Gwyneth, Chloe, Hayley, Scarlett,
Emma, Olivia, Victoria, Gwyneth, , , ,

Anyone knows whats happening? What am missing?

char **concatenate(char **array1, int n1, char *pivot, char **array2, int n2, int len){
int i=0, j=0;
int elements = n1 + n2 + 1;

// alocating array
char **concat = (char**) malloc(sizeof(*concat) * elements);
concat[0] = (char*) malloc(sizeof(*concat) * elements * len);
for(i=1; i<elements; i++)
    concat[i] = &(concat[0][i*len]);

// concatenating 
for(i=0; i<n1; i++)
    concat[i] = array1[i];
concat[i++] = pivot;
for(j=0; j<n2; j++)
    concat[i++] = array2[j];

// returning
return concat;
}

char **quicksort(char **array, int elements, int len){
// array is already sorted
if(elements < 2)
    return array;

int pivot;
int i=0, l=0, r=0;

// selecting the pivot (median)
if(elements % 2 == 0)
    pivot = ((elements + 1) / 2) -1;
else
    pivot = (elements / 2) -1;

//REMOVE
printf("Pivot: %d\n", pivot);
for(i=0; i<elements; i++)
    printf("%s, ", array[i]);
printf("\n");

// alocating arrays
char **left = (char**) malloc(sizeof(*left) * pivot);
left[0] = (char*) malloc(sizeof(*left) * pivot * len);
for(i=1; i<pivot; i++)
    left[i] = &(left[0][i*len]);

char **rigth = (char**) malloc(sizeof(*rigth) * pivot);
rigth[0] = (char*) malloc(sizeof(*rigth) * pivot * len);
for(i=1; i<pivot; i++)
    rigth[i] = &(rigth[0][i*len]);

//REMOVE
for(i=0; i<elements; i++)
    printf("%s, ", array[i]);
printf("\n");

//quicksorting
for(i=0; i<elements; i++){
    if(array[i] == array[pivot])
        continue;

    int comp = strcmp(array[i], array[pivot]);

    //REMOVE
    printf("%d: strcmp %s, %s is %d\n", i, array[i], array[pivot], comp);

    if(comp < pivot)
        left[l++] = array[i];
    else
        rigth[r++] = array[i];
}

//REMOVE
printf("concatenate(");
for(i=0; i<l; i++)
    printf("%s ", left[i]);
printf("|%s| ", array[pivot]);
for(i=0; i<r; i++)
    printf("%s ", rigth[i]);
printf(")\n");

// recursion and return
return concatenate(quicksort(left, l, len), l, array[pivot], quicksort(rigth, r, len), r, len);
}

int main(int argc, char *argv[]){
int i, j, aux;                  

char **teste = (char**) malloc(sizeof(*teste) * 7);
teste[0] = (char*) malloc(sizeof(*teste) * 7 * 128);
for(i=1; i<7; i++)
    teste[i] = &(teste[0][i*128]);
teste[0] = "Emma";
teste[1] = "Olivia";
teste[2] = "Victoria";
teste[3] = "Gwyneth";
teste[4] = "Chloe";
teste[5] = "Hayley";
teste[6] = "Scarlett";

quicksort(teste, 7, 128);

printf("AFTER\n");
for(i=0; i<7; i++)
    printf("%s, ", teste[i]);
printf("\n");

return 0;
}
9
  • You're allocating an array sized for ints but you're storing pointers in it. Use char **concat = malloc(sizeof(*concat) * elements); instead. You'll have to make similar changes for all of the malloc calls. (You also don't need to allocate memory for Quick Sort. One of its strengths is that it can sort in place.) Commented Oct 26, 2013 at 21:40
  • Compile with warnings turned on ( -Wall in gcc ) and fix the warnings before running your code. Commented Oct 26, 2013 at 21:42
  • @pburka i've changed all the malloc call but i'm still missing some elements on the same spot. Commented Oct 26, 2013 at 21:49
  • @bfagundes Do you need to write your own quicksort, or can you use the standard library function qsort()? Commented Oct 26, 2013 at 21:52
  • 1
    Note that your mallocs are still incorrect, but you can nearly always get away with what you've done. concat[0] = (char*) malloc(sizeof(*concat) * elements * len); should be concat[0] = malloc(sizeof(*concat[0]) * elements * len);. Commented Oct 26, 2013 at 23:28

1 Answer 1

8

There is zero reason to allocate for quicksort, and in fact the function can easily suffice in your case with a simple interface of quicksort(char *arr[], unsigned int len), using pointer-math for the subsequence invocations.

Provide a swap algorithm for exchanging pointers:

void swap_str_ptrs(char const **arg1, char const **arg2)
{
    const char *tmp = *arg1;
    *arg1 = *arg2;
    *arg2 = tmp;
}

Then the algorithm is:

void quicksort_strs(char const *args[], unsigned int len)
{
    unsigned int i, pvt=0;

    if (len <= 1)
        return;

    // swap a randomly selected value to the last node
    swap_str_ptrs(args+((unsigned int)rand() % len), args+len-1);

    // reset the pivot index to zero, then scan
    for (i=0;i<len-1;++i)
    {
        if (strcmp(args[i], args[len-1]) < 0)
            swap_str_ptrs(args+i, args+pvt++);
    }

    // move the pivot value into its place
    swap_str_ptrs(args+pvt, args+len-1);

    // and invoke on the subsequences. does NOT include the pivot-slot
    quicksort_strs(args, pvt++);
    quicksort_strs(args+pvt, len - pvt);
}

Thats everything. including the partitioning.

How It Works

There are two general recursive quicksort algorithms: the squeeze, and the sweep. This is the sweep algorithm. We march up the sequence, swapping any element "less" than than pivot value (which is swapped to the end of the sequence before the loop starts) to a target slot, the index of which is initially the beginning of the sequence and increases with each swap operation. When the "sweep" is finished, the pvt index is where the pivot value belongs, as everything below that slot is "less" than the that value. So one more swap is made to put the pivot value into position. After that we have two partitions, which are recursed. It is vital that the slot we just identified as the pivot location is not included in either of those partitions. It is the only value we know is in its final resting place.

Test Harnass

Including the above code, we test this with a basic set of strings purposely out of order:

void print_list(char const *args[], unsigned len)
{
    unsigned i=0;
    for (;i<len;++i)
        puts(args[i]);
}

int main()
{
    char const *args[] =
    {
        "this", "is", "a", "test", "of", "quicksort", "with", "strings"
    };

    srand((unsigned)time(NULL));
    quicksort_strs(args, sizeof(args)/sizeof(*args));
    print_list(args, sizeof(args)/sizeof(*args));
    return 0;
}

Output

a
is
of
quicksort
strings
test
this
with

Non-recursive implementation

It should be noted that the above algorithm lends itself beautifully to a non-recursive implementation. A local dynamic stack is used for holding pairs of data: an pointer and a length. Optimized to not push trivial segments (segments of length 1 or 0) on to the stack, one implementation would like like this:

void quicksort_strs(char const *args[], unsigned int len)
{
    // holds our non-recursive stack of segments
    struct segment
    {
        char const **arr;
        unsigned int len;
        struct segment* next;
    } *stack = NULL;

    stack = malloc(sizeof(*stack));
    stack->arr = args;
    stack->len = len;
    stack->next = NULL;

    while (stack != NULL)
    {
        unsigned int i, pvt=0;
        struct segment *tmp = stack;
        stack = stack->next;

        // pull values and delete segment record
        args = tmp->arr;
        len = tmp->len;
        free(tmp);

        // nothing to unary segments
        if (len <= 1)
            continue;

        // swap a randomly selected value to the last node
        swap_str_ptrs(args+((unsigned int)rand() % len), args+len-1);

        // reset the pivot index to zero, then scan
        for (i=0;i<len-1;++i)
        {
            if (strcmp(args[i], args[len-1]) < 0)
                swap_str_ptrs(args+i, args+pvt++);
        }

        // move the pivot value into its place
        swap_str_ptrs(args+pvt, args+len-1);

        // lhs segment push
        if (pvt > 1)
        {
            tmp = malloc(sizeof(*tmp));
            tmp->arr = args;
            tmp->len = pvt;
            tmp->next = stack;
            stack = tmp;
        }

        // rhs segment push
        if ((len - ++pvt) > 1)
        {
            tmp = malloc(sizeof(*tmp));
            tmp->arr = args+pvt;
            tmp->len = len-pvt;
            tmp->next = stack;
            stack = tmp;
        }
    }
}

Obviously having a canned node-stack implementation would shorten this up considerably, but the idea should be readily apparent. A realloc() schema for holding nodes on the end of the "stack" rather than the beginning would be equally interesting, as it would eliminate the need to next pointer management, replaced with a top index instead.

Anyway, good luck, and I hope it helps.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.