Number of different elements in an array

Question

Is it possible to compute the number of different elements in an array in linear time and constant space? Let us say it's an array of long integers, and you can not allocate an array of length sizeof(long).

P.S. Not homework, just curious. I've got a book that sort of implies that it is possible.

In-place modification of the array won't change the linear time, because you can copy or duplicate an array in linear time. — Kevin Sylvestre
– Kevin Sylvestre, Commented Mar 23, 2010 at 17:20
@Kenny: Oops. Sorry about that, I missed that bit. You are correct. — Kevin Sylvestre
– Kevin Sylvestre, Commented Mar 23, 2010 at 18:13

Larry · Accepted Answer · 2010-03-23 17:22:16Z

3

This is the Element uniqueness problem, for which the lower bound is Ω( n log n ), for comparison-based models. The obvious hashing or bucket sorting solution all requires linear space too, so I'm not sure this is possible.

answered Mar 23, 2010 at 17:22

Larry

4,5892 gold badges26 silver badges16 bronze badges

Sign up to request clarification or add additional context in comments.

5 Comments

ziggystar Over a year ago

I think the bucket solutions require constant space. Actually they require that the number of possible different values is constant. So a bucket (with a boolean) for each value is still constant space.

Larry Over a year ago

That's not a bucket, that's a hash. Yes, it is constant for some definition of constant, for example, you can allocate an array the size of the known universe every time. It does not further the conversation though. Of course, you can say to yourself, wait, there are only N possible unique values! So let's just use N buckets instead! Oh wait..

ziggystar Over a year ago

That's how it's done. I'm not saying this is a practical solution. It's a theoretical answer to a theoretical question.

Larry Over a year ago

I have a better solution for O(N) solution anyhow, "constant" space. Just precalculate the answers for all possible array size, elements. All you have to do now is look it up in the table!

ziggystar Over a year ago

Actually, that's a valid answer in the theory of computational complexity, given you're dealing with a finite number of different inputs. Btw, hashmaps are not constant in lookup time but linear or log, depending on the implementation.

Rex Kerr · Accepted Answer · 2010-03-23 17:23:49Z

1

You can't use constant space. You can use O(number of different elements) space; that's what a HashSet does.

answered Mar 23, 2010 at 17:23

Rex Kerr

168k27 gold badges325 silver badges411 bronze badges

4 Comments

user287792 Over a year ago

The asker probably wants O(log # of distinct elements) space.

Rex Kerr Over a year ago

@algorithmist: But that takes O(n log m) time, where m is the number of distinct elements. That's not linear.

user287792 Over a year ago

I didn't even specify an algorithm. How do you know how fast it runs? Usually these kinds of questions are best posed in the unit-cost model, which counts basic operations on (log n)-bit words.

Rex Kerr Over a year ago

I can make a good guess based upon how many bits of information about the answer you can possibly accumulate per step. n space allows for log n accumulation of information per step, but you don't allow that much, so you're limited to the n log n family of solutions.

user300653 · Accepted Answer · 2010-03-24 09:21:17Z

1

You can use any sorting algorithm and count the number of different adjacent elements in the array.

answered Mar 24, 2010 at 9:21

user300653

5374 silver badges3 bronze badges

Comments

Kevin Sylvestre · Accepted Answer · 2010-03-23 17:19:44Z

0

I do not think this can be done in linear time. One algorithm to solve in O(n log n) requires first sorting the array (then the comparisons become trivial).

answered Mar 23, 2010 at 17:19

Kevin Sylvestre

38.2k33 gold badges159 silver badges235 bronze badges

Comments

Rob Lachlan · Accepted Answer · 2010-03-23 17:22:36Z

0

If you are guaranteed that the numbers in the array are bounded above and below, by say a and b, then you could allocate an array of size b - a, and use it to keep track of which numbers have been seen.

i.e., you would move through your input array take each number, and mark a true in your target array at that spot. You would increment a counter of distinct numbers only when you encounter a number whose position in your storage array is false.

answered Mar 23, 2010 at 17:22

Rob Lachlan

14.5k5 gold badges51 silver badges100 bronze badges

Comments

user287792 · Accepted Answer · 2010-03-23 18:06:59Z

Assuming we can partially destroy the input, here's an algorithm for n words of O(log n) bits.

Find the element of order sqrt(n) via linear-time selection. Partition the array using this element as a pivot (O(n)). Using brute force, count the number of different elements in the partition of length sqrt(n). (This is O(sqrt(n)^2) = O(n).) Now use an in-place radix sort on the rest, where each "digit" is log(sqrt(n)) = log(n)/2 bits and we use the first partition to store the digit counts.

If you consider streaming algorithms only ( http://en.wikipedia.org/wiki/Streaming_algorithm ), then it's impossible to get an exact answer with o(n) bits of storage via a communication complexity lower bound ( http://en.wikipedia.org/wiki/Communication_complexity ), but possible to approximate the answer using randomness and little space (Alon, Matias, and Szegedy).

ziggystar · Accepted Answer · 2010-03-23 17:34:34Z

-1

This can be done with a bucket approach when assuming that there are only a constant number of different values. Make a flag for each value (still constant space). Traverse the list and flag the occured values. If you happen to flag an already flagged value, you've found a duplicate. You have to traverse the buckets for each element in the list. But that's still linear time.

answered Mar 23, 2010 at 17:34

ziggystar

28.7k9 gold badges76 silver badges127 bronze badges

4 Comments

Larry Over a year ago

Why would you make the assumption that there are only a constant number of different values?

ziggystar Over a year ago

Assuming we are talking about arrays that contain longs (or bytes or shorts to be more realistic), you have a constant space complexity. There are only 256 different byte values, 65336 values for short, etc...

Larry Over a year ago

So, what's your array size for an array with elements of type "string", length K each? What about arrays of arbitrary objects? How about providing a feasible answer?

ziggystar Over a year ago

I was talking about objects with a fixed number of possible values. Strings are not covered by my answer. I return the question: How about an array of objects of infinite size?

Collectives™ on Stack Overflow

Number of different elements in an array

7 Answers 7

5 Comments

4 Comments

Comments

Comments

Comments

Comments

4 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

7 Answers 7

5 Comments

4 Comments

Comments

Comments

Comments

Comments

4 Comments

Your Answer

Sign up or log in

Post as a guest

Related