1

I have an array of integers, and I want to find all the distinct elements in the array using C++.

Solution 1: brute force using nested loops; the complexity of this solution is O(n^2).

Solution 2: sorting; this takes O(n log n).

Is there any other technique that gives better results than O(n log n)? Any other data structure, or a different approach?

1
  • By distinct you mean, that have frequency 1? Commented Aug 22, 2015 at 4:57

3 Answers

4

Using a std::unordered_set will be O(n).


2 Comments

The average complexity for hash tables is O(n), not the worst-case. It is possible to do this in O(n) when the max. integer is known and small enough.
@Jens Hash table insertion is O(1) amortized, which is a stronger guarantee than merely average case. It is not possible to find "worst-case" inputs for which the average insert in a (long enough) sequence of inserts is worse than O(1).
0

You can also try std::nth_element: it partially sorts the range [first, last) around the n-th position, such that every element in [first, n-th) is <= the element at the n-th position, and every element in (n-th, last) is >= it. It has linear complexity, O(n), and finds the element that would sit at the n-th position in sorted order. It's not suited to finding a specific value, though, so maybe it's not exactly what you need. But it's worth keeping in mind :-)

Comments

-1

If you know the maximum integer value, and it is reasonably small, you can allocate a large vector and use it to count the frequency of each integer. Then iterate over the counts and collect all values with frequency one:

#include <iterator>
#include <vector>

template<typename I>
auto findWithFrequency(int f, int max, I first, I last)
{
    // max + 1 slots so the value max itself is a valid index.
    std::vector<int> counts(max + 1, 0);
    for (I it = first; it != last; ++it)
        counts[*it] += 1;

    // Collect the values (i.e. the indices) whose count equals f.
    std::vector<typename std::iterator_traits<I>::value_type> v;
    for (int i = 0; i <= max; ++i)
        if (counts[i] == f)
            v.push_back(i);

    return v;
}

In the worst case, this needs one pass over the input array and one pass over the counts array, so the complexity is O(n + max), which is O(n) when max is of the order of n or smaller.

This is essentially the idea behind counting sort, the simplest form of bucket sort, and the building block of radix sort.

3 Comments

Your technique does decrease the time below O(n log n). Maybe a hash would help.
@sam I don't understand your comment. The algorithm first iterates over the input array, which is O(n). It then iterates over the bucket array, which is also O(n). Where would a hash function help? The complexity for hash tables is O(n) on average, but bucket sort and radix sort have a worst-case complexity of O(n).
Would the downvoter comment why the -1? I think it is a reasonable approach for restricted integers.
