sort and remove duplicates from int array in c

Question

I am learning C and came over the topic of sorting. I wrote a comp() function in and used qsort to sort an array of int. Now for the next task I need to remove the duplicates from the array.
Is it possible to sort and remove duplicates at the same time?

#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>    
int indexes[10] = { 0, 98, 45, 65, 45, 98, 78, 56, 65, 45 };

int comp(const void * elem1, const void * elem2) {

    int f = *((int*) elem1);
    int s = *((int*) elem2);

    if (f > s) {    
        return 1;
    }    
    if (f < s) {    
        return -1;
    }    
    return 0;
}

void printIndexArray() {    
    int i = 0;    
    for (i = 0; i < 10; i++) {    
        printf("i is %d\n", indexes[i]);    
    }
}

int main() {    
    qsort(indexes, sizeof(indexes) / sizeof(int), sizeof(int), comp);    
    printIndexArray();    
    return 0;
}

You are using inbuild function for sorting, write your own function. — Grijesh Chauhan
– Grijesh Chauhan, Commented Sep 20, 2013 at 19:58

StilesCrisis · Accepted Answer · 2013-09-22 12:10:12Z

2

Since your numbers are already sorted, removing dupes is easy. In C++, it's even built in as std::unique:

http://en.cppreference.com/w/cpp/algorithm/unique

Assuming you want to do it yourself, you can do it the same way unique does it:

int* unique (int* first, int* last)
{
  if (first==last) return last;

  int* result = first;
  while (++first != last)
  {
    if (!(*result == *first)) 
      *(++result)=*first;
  }
  return ++result;
}

edited Sep 22, 2013 at 12:10

user283145

answered Sep 20, 2013 at 20:01

StilesCrisis

16.4k4 gold badges44 silver badges64 bronze badges

Sign up to request clarification or add additional context in comments.

4 Comments

user2800463 Over a year ago

how do I use the above method, just started to learn so my pointers are that good. Can u help integrate the unique function into my code above

StilesCrisis Over a year ago

Just call unique(indexes, indexes + (sizeof(indexes) / sizeof(int)));

StilesCrisis Over a year ago

BTW, I recommend making a macro or something to get the array size. Something like #define arrsize(x) (sizeof(x) / sizeof(x[0]))

StilesCrisis Over a year ago

If you have a macro like that, it would be unique(indexes, indexes + arrsize(indexes));

Ed Heal · Accepted Answer · 2013-09-20 19:59:28Z

1

Yes

This can be achieved by mergesort. If both left and right are the same just merge the one value

answered Sep 20, 2013 at 19:59

Ed Heal

60.3k18 gold badges91 silver badges137 bronze badges

Comments

jfly · Accepted Answer · 2013-09-21 00:06:23Z

That's the code that removes the duplicates using mergesort. This snippet of code does the removing work:

else if(a[p1] == a[p2])
{
    merged[p] = a[p1];
    p1++;
    p2++;
}

That's the iterative merge sort while the recursive version would be easier.

#include <stdio.h>
#include <stdlib.h>

#define min(a,b) (((a) < (b)) ? (a) : (b))

int indexes[10] = { 0, 98, 45, 65, 45, 98, 78, 56, 65, 45 };

void merge(int *a, int s, int m, int e)
{
    int p1 = s;
    int p2 = m + 1;
    int * merged = (int*)malloc(sizeof(int) * (e - s + 1));
    int p = 0;
    while(p1 < m + 1 && p2 < e + 1)
    {
        if(a[p1] > a[p2])
        {
            merged[p] = a[p2];
            p2++;
        }
        else if(a[p1] == a[p2])
        {
            merged[p] = a[p1];
            p1++;
            p2++;
        }
        else
        {
            merged[p] = a[p1];
            p1++;
        }
        p++;
    }

    while(p1 < m + 1)
    {
        merged[p++] = a[p1++];
    }

    while(p2 < e + 1)
        merged[p++] = a[p2++];

    int i;
    for(i = 0;i < (e -s+1); i++)
    {
        a[s + i] = merged[i];
    }

    free(merged);
}

void merge_sort(int *a, int n)
{
    int width;
    for(width = 1; width < n; width = 2 * width)
    {
        int i;
        for(i = 0; i < n; i = i + 2 * width)
        {
            merge(a, i, min(i + width - 1, n - 1), min(i + 2 * width - 1, n - 1) );
        }
    }
}

void printIndexArray()
{    
    int i = 0;    
    for(i = 0; i < 10; i++)
    {    
        printf("i is %d\n", indexes[i]);    
    }
}

int main()
{
    merge_sort(indexes, sizeof(indexes) / sizeof(int) );
    printIndexArray();
    return 0;
}

wildplasser · Accepted Answer · 2013-09-22 14:07:48Z

#include <stdio.h>
#include <stdlib.h>

int indexes[10] = { 0, 98, 45, 65, 45, 98, 78, 56, 65, 45 };

size_t undup(int array[], size_t len)
{
size_t src,dst;

if (!len) return 0;
for (src=dst=1; src < len; src++) {
        if (array[dst-1] == array[src]) continue;
        array[dst++] = array[src];
        }
return dst;
}

int comp(const void * elem1, const void * elem2) {

    int f = *((int*) elem1);
    int s = *((int*) elem2);

    if (f > s)     return 1;
    if (f < s)     return -1;

    return 0;
}

void printIndexArray(size_t len) {
    size_t i = 0;
    for (i = 0; i < len; i++) {
        printf("array[%zu] is %d\n", i, indexes[i]);
    }
}

int main() {
    size_t len = 10;
    printf("Before sort\n" );
    printIndexArray(len);

    qsort(indexes, sizeof indexes / sizeof indexes[0], sizeof indexes[0], comp);
    printf("After sort\n" );
    printIndexArray(len);

    len = undup(indexes,10);
    printf("After undup\n" );
    printIndexArray(len);

    return 0;
}

cmaster - reinstate monica · Accepted Answer · 2013-09-20 23:55:16Z

The short answer is: yes.

The long answer is: it is always possible, but the complexity to do it depends heavily on the algorithm you use.

The more complex algorithms like quick-sort, slow-sort, bucket-sort, and straight-radix-sort do not lend themselves to such an enhancement, because they rely on the data being in a consecutive array, that can implicitly be split into subarrays. So, when you detect a duplicate, you cannot easily take it out. Again, it is possible, but certainly not a problem for beginners.

The less complex in-place algorithms like bubble-sort, insertion-sort, and shell-sort make it relatively easy: you can just replace one of the duplicates you detect with a sentinel value that sorts greater than all legal values, and let it rise to the top. After that, you just need to scoop off the cream of sentinel values and you are done.

The algorithms that really lend themselves to removing duplicates, are the ones that use intermediate arrays that grow/shrink in the process; in these cases you can just shrink or skip growing one of these intermediate arrays when you detect a duplicate. Candidates are merge-sort and heap-sort.

Note, however, that it is more prudent to just sort the array, and eliminate duplicates in a second, separate step. Why? Because eliminating duplicates adds complexity to the inner loop of the sorting algorithm, which is of O(n*log(n)) in most relevant cases. But eliminating duplicates from a sorted array is an O(n) operation, making the split operation faster than the fused one.

Collectives™ on Stack Overflow

sort and remove duplicates from int array in c

5 Answers 5

4 Comments

Comments

Comments

Comments

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

5 Answers 5

4 Comments

Comments

Comments

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related