2

I have an array of structs; the array is of size N.

I want to remove duplicates from the array; that is, do an in-place change, converting the array to have a single appearance of each struct. Additionally, I want to know the new size M (highest index in the reduced array).

The structs include primitives so it's trivial to compare them.

How can I do that efficiently in C++?

I have implemented the following operators:

bool operator==(const A &rhs1, const A &rhs2) 
{       
    return ( ( rhs1.x== rhs2.x )  &&
             ( rhs1.y == rhs2.y ) );
}

bool operator<(const A &rhs1, const A &rhs2) 
{       
    if ( rhs1.x == rhs2.x )  
             return ( rhs1.y < rhs2.y );

    return ( rhs1.x < rhs2.x );
}

However, I get an error when running:

std::sort(array, array+ numTotalAvailable);

 * array will have all elements here valid.

std::unique_copy(
        array, 
        array+ numTotalAvailable, 
        back_inserter(uniqueElements)); 

 * uniqueElements will have non-valid elements.

What is wrong here?

3
  • What is your definition of array? Real array s have a compile time size and cannot be shrinked, dynamic arrays (allocated with new) have a runtime size, but that size cannot be changed after the array is allocated... Are you referring to std::vector? Commented Aug 24, 2011 at 18:05
  • i want to shrink the number of indices and get it back! i am not trying to shrink the array memory. Commented Aug 24, 2011 at 18:26
  • OK, if i resize uniqueElements to be big enough then it works! however, i don't know how big it should be, how can i know that? Commented Aug 24, 2011 at 19:01

4 Answers 4

6

You could use a combination of the std::sort and std::unique algorithms to accomplish this:

std::sort(elems.begin(), elems.end());                  // Now in sorted order.
iterator itr = std::unique(elems.begin(), elems.end()); // Duplicates overwritten
elems.erase(itr, elems.end());                          // Space reclaimed

If you are working with a raw array (not, say, a std::vector), then you can't actually reclaim the space without copying the elements over to a new range. However, if you're okay starting off with a raw array and ending up with something like a std::vector or std::deque, you can use unique_copy and an iterator adapter to copy over just the unique elements:

std::sort(array, array + size); // Now in sorted order

std::vector<T> uniqueElements;
std::unique_copy(array, array + size,
                 back_inserter(uniqueElements)); // Append unique elements

At this point, uniqueElements now holds all the unique elements.

Finally, to more directly address your initial question: if you want to do this in-place, you can get the answer by using the return value from unique to determine how many elements remain:

std::sort(elems, elems + N);                // Now in sorted order.
T* endpoint = std::unique(elems, elems + N);// Duplicates overwritten
ptrdiff_t M = endpoint - elems;             // Find number of elements left

Hope this helps!

Sign up to request clarification or add additional context in comments.

7 Comments

This is the way to go, but note that this does not work on arrays, but on STL containers (consider using std::vector), but then again, you cannot shrink a proper array anyway!
@sramij- In order for the sort and unique calls to work, you'll either need to provide a comparison function as a final parameter, or you'll need to overload operator< and operator== for the struct type. The comparison function for sort should take in two structs and return if the first comes before the second (note that this is different from how qsort and bsearch take their arguments), while the comparison for unique should take in two structs and report whether they're equal.
how can i copy now from uniqueElements vector into another array2 ?
You probably wouldn't. If you wanted to create a new array, the best option would be to use the last code snippet to apply unique in-place, count up how many elements there are, and then allocate a new array with exactly that much storage space. From there, you could use std::copy to copy the elements over.
You will need to define some sort of ordering, since otherwise you can't use sort to gather together equivalent elements. Even if you don't have a natural ordering, you can define one as follows. Start by comparing the first field of the two structs. If they're not equal, return whether the lhs's field is less than the rhs's field. Otherwise, go to the second field and repeat this check, then go to the third, etc. You don't need to define operator< to do this; just define a comparison function and provide it as the final parameter to sort.
|
1
std::set<T>  uniqueItems(v.begin(), v.end());

Now uniqueItems contains only the unique items. Do whatever you want to do with it. Maybe, you would like v to contain all the unique items. If so, then do this:

//assuming v is std::vector<T>
std::vector<T>(uniqueItems.begin(), uniqueItems.end()).swap(v);

Now v contains all the unique items. It also shrinks v to a minimum size. It makes use of Shrink-to-fit idiom.

2 Comments

i cannot define any operators for my struct, and these algorithms require that.
@sramiji: Not necessarily. You can provide free compare() function as well.
0

You could use the flyweight pattern. Easiest way to do so, would be using the Boost Flyweight library.

Edit: I'm not sure if there is some way to find out how many objects are stored by the Boost flyweight implementation, if there is, I can't seem to find it in the documentation.

Comments

0

An alternative approach to applying algorithms to your array would be to insert its elements in a std::set. Whether it is reasonable to do it this way depends on how you plan to use your items.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.