4

I am looking for an algorithm to solve the following problem: We are given an integer array of size n in which k elements (0 < k < n) occur exactly once. Every other integer occurs an even number of times in the array. The output should be any one of the k unique numbers. k is a fixed number and not part of the input.

An example would be the input [1, 2, 2, 4, 4, 2, 2, 3] with both 1 and 3 being a correct output.

Most importantly, the algorithm should run in O(n) time and require only O(1) additional space.

edit: There has been some confusion regarding whether there is only one unique integer or multiple. I apologize for this. The correct problem is that there is an arbitrary but fixed number of them. I have updated the original question above.

"Dante." gave a good answer for the case that there are at most two such numbers. This link also provides a solution for three. "David Eisenstat" commented that it is also possible to do for any fixed k. I would be grateful for a solution.

12
  • I'm positive that I have seen this on SO already, because the answer stuck with me. Curiously, I can't find it.... Wait, this should be it. Commented Sep 11, 2015 at 13:49
  • 2
    @G.Bach Might be this one, no? stackoverflow.com/a/7907402/5325238 Commented Sep 11, 2015 at 13:55
  • 1
    Is it certain that there can be 2 unique elements? Commented Sep 11, 2015 at 14:07
  • 2
    This one also seems related. Commented Sep 11, 2015 at 14:08
  • 1
    If the number of unique integers is not fixed or at least bounded, then the space complexity must be O(n) in order to hold the result for each unique number. Commented Sep 11, 2015 at 14:19

6 Answers

11

There is a standard algorithm for such problems that uses the XOR operator:

Time Complexity = O(n)

Space Complexity = O(1)

Suppose your input array contains only one element that occurs an odd number of times and the rest occur an even number of times. We take advantage of the following fact:

The XOR of any sequence of bits is 0 exactly when the sequence contains an even number of 1's; the 0's do not affect the result.

That is

0^1^....... = 0 as long as the number of 1's is even

and the 0's and 1's can occur in any order.

Every number that occurs an even number of times contributes an even number of 1's at each bit position, so at each position these numbers XOR to 0. When we take the XOR of all elements of the array, only the bits of the number that occurs once are left, because

0 (from numbers occurring an even number of times) ^ 1 (from the number occurring once) = 1

0 (from numbers occurring an even number of times) ^ 0 (from the number occurring once) = 0

As you can see, only the bits of the number occurring once are preserved.

This means that when you take the XOR of all the elements of such an array, the result is the number that occurs only once.

So the algorithm for an array of length n is:

 result = array[0] ^ array[1] ^ ... ^ array[n-1]
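A minimal JavaScript sketch of the single-element case. It assumes every input fits in a 32-bit signed integer, because JavaScript's bitwise operators truncate to that width; the name findSingle is made up for illustration.

```javascript
// XOR all elements together: values that occur an even number of
// times cancel, leaving the one value that occurs an odd number of times.
// Assumption: all inputs are 32-bit signed integers.
function findSingle(array) {
    let result = 0;
    for (const v of array) {
        result ^= v;
    }
    return result;
}

console.log(findSingle([2, 3, 4, 2, 4])); // 3
```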

Different Scenario

As the OP mentioned, the input can also be an array in which two numbers occur only once and the rest occur an even number of times.

This is solved using the same logic as above, with a small difference.

Idea of the algorithm:

If you take the XOR of all the elements, then all the bits of the elements occurring an even number of times cancel to 0, which means:

The result has a 1 bit exactly at those positions where the two numbers occurring only once differ.

We will use the above idea.

Now we pick any one bit of the resulting XOR that is 1 and set the rest to 0. The result is a mask that lets us tell the two required numbers apart.

Because the bit is 1, the two numbers differ at this position: one has 0 there and the other has 1. This means that ANDing with the mask gives 0 for one of the numbers and a nonzero value for the other.

The rightmost set bit is very easy to isolate, so we take it from the XOR result as

 B = result & ~(result-1)

Now traverse the array once. If array[i] & B is nonzero, fold the element into the variable number_1 as

 number_1 = number_1^array[i]

otherwise

 number_2 = number_2^array[i]

Each of the remaining numbers lands in the same group every time it appears, and it appears an even number of times, so its bits cancel automatically.

So the algorithm is

1. Take the XOR of all elements; call it xor.

2. Isolate the rightmost set bit of xor and store it in B.

3. Do the following:

number_1 = 0, number_2 = 0;
for (i = 0; i < n; i++)
{
 // B splits the array into two groups, one per once-only number
 if (array[i] & B)
  number_1 = number_1 ^ array[i];
 else
  number_2 = number_2 ^ array[i];
}

number_1 and number_2 are the required numbers.
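The steps above could look roughly like this in JavaScript. This is only a sketch under the assumption that all values are 32-bit signed integers and that exactly two values occur an odd number of times; findTwo is a hypothetical name.

```javascript
// Assumption: 32-bit signed integer inputs, exactly two odd-count values.
function findTwo(array) {
    // Step 1: XOR of everything equals (first unique) ^ (second unique).
    let xor = 0;
    for (const v of array) {
        xor ^= v;
    }
    // Step 2: isolate the rightmost set bit; xor & -xor is the same
    // value as xor & ~(xor - 1) in two's complement.
    const B = xor & -xor;
    // Step 3: split by that bit and XOR each group separately.
    let number_1 = 0, number_2 = 0;
    for (const v of array) {
        if (v & B) {
            number_1 ^= v;
        } else {
            number_2 ^= v;
        }
    }
    return [number_1, number_2];
}

console.log(findTwo([1, 2, 2, 4, 4, 2, 2, 3])); // [3, 1]
```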


13 Comments

Can you generalize this to k elements that occur only once?
@aquinas No we cannot. At least according to my past lectures on algorithms.
You cannot generalize because even for k=3 you can construct a sequence whose XOR is 0. Example: {1, 2, 3}. All bits appears an even number of times even though each number appears once.
@aquinas The OP should have mentioned that. I can assure you that, given the time and space constraints, this is the right way. I have done problems like this in the past. Actually, we need to remember that not all questions are perfectly specific.
@fjardon That's what I said. Anyway :)
3

Here's a Las Vegas algorithm that, given k, the exact number of elements that occur an odd number of times, reports all of them in expected time O(n k) (read: linear-time when k is O(1)) and space O(1) words, assuming that "give me a uniform random word" and "give me the number of 1 bits set in this word (popcount)" are constant-time operations. I'm pretty sure that I'm not the first person to come up with this algorithm (and I'm not even sure that I'm remembering all of the refinements), but I've reached the limits of my patience trying to find it.

The central technique is called random restrictions. Essentially what we do is to filter the input randomly by value, in the hope that we retain exactly one odd-count element. We apply the classic XOR algorithm to the filtered array and check the result; if it succeeded, then we pretend to add it to the array, to make it even-count. Repeat until all k elements are found.

The filtration process goes like this. Treat each input word x as a binary vector of length w (doesn't matter what w is). Compute a random binary matrix A of size ceil(1 + lg k) by w and a random binary vector b of length ceil(1 + lg k). We filter the input by retaining those x such that Ax = b, where the left-hand side is a matrix multiplication mod 2. In implementation, A is represented as ceil(1 + lg k) vectors a1, a2, .... We compute the bits of Ax as popcount(a1 & x) mod 2, popcount(a2 & x) mod 2, .... (This is convenient because we can short-circuit the comparison with b, which shaves a factor lg k from the running time.)

The analysis is to show that, in a given pass, we manage with constant probability to single out one of the odd-count elements. First note that, for some fixed x, the probability that Ax = b is 2^-ceil(1 + lg k) = Θ(1/k). Given that Ax = b, for each y ≠ x, the probability that Ay = b is at most 2^-ceil(1 + lg k) ≤ 1/(2k). Summing over the fewer than k other odd-count elements, the expected number of them that accompany x is less than 1/2, so with probability more than 1/2, x is the only odd-count element in the filtered input. Sum over all k odd-count elements (these events are disjoint), and the probability is Θ(1).
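Here is a rough JavaScript sketch of the random-restriction loop, not a tuned implementation. Assumptions: words are 32-bit signed integers, Math.random stands in for "give me a uniform random word", the parity helper stands in for popcount mod 2, and the "check the result" step is done by counting occurrences of the candidate in one extra pass (the answer does not say how to check, so that part is a guess). All function names are made up.

```javascript
// Parity (popcount mod 2) of a 32-bit word, by XOR-folding.
function parity(v) {
    v ^= v >>> 16; v ^= v >>> 8; v ^= v >>> 4; v ^= v >>> 2; v ^= v >>> 1;
    return v & 1;
}

// Stand-in for a uniform random 32-bit word.
function randomWord() {
    return (Math.random() * 0x100000000) | 0;
}

// Las Vegas search for all k odd-count values (k must be exact).
function findOddCount(arr, k) {
    const m = Math.ceil(1 + Math.log2(k)); // rows of A, length of b
    const found = [];                      // O(k) words, k is a constant
    while (found.length < k) {
        const rows = [], b = [];
        for (let j = 0; j < m; j++) {
            rows.push(randomWord());
            b.push(randomWord() & 1);
        }
        // Retain x iff Ax = b (mod 2); every() short-circuits.
        const keep = x => rows.every((a, j) => parity(a & x) === b[j]);
        let acc = 0;
        for (const x of arr) if (keep(x)) acc ^= x;
        // Pretend already-found values were appended to the array.
        for (const f of found) if (keep(f)) acc ^= f;
        // Check: accept acc only if it really has odd count and is new.
        let count = 0;
        for (const x of arr) if (x === acc) count++;
        if (count % 2 === 1 && !found.includes(acc)) found.push(acc);
    }
    return found;
}

console.log(findOddCount([1, 2, 2, 4, 4, 2, 2, 3], 2)); // 1 and 3, in some order
```

The order of the returned values depends on the random choices, but the set of values does not, because a candidate is only accepted after the occurrence check.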


Here's a deterministic linear-time algorithm for k = 3. Let the odd-count elements be a, b, c. Accumulate the XOR of the array, which is s = a ^ b ^ c. For each bit i, observe that, if a[i] == b[i] == c[i], then s[i] == a[i] == b[i] == c[i]. Make another pass through the array, accumulating the XOR of the lowest bit set in s ^ x over all elements x. The even-count elements contribute nothing again. Two of the odd-count elements contribute the same bit and cancel each other out. Thus, the single bit left in this XOR is the lowest set bit of s ^ x for exactly one of the odd-count elements. We can use the restriction method above to find it (retain only those x whose lowest set bit of s ^ x is that bit), then the k = 2 method to find the others.
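One possible way to write the k = 3 procedure in JavaScript, as a sketch only. It assumes 32-bit signed integer inputs and exactly three odd-count values; lowbit and findThree are hypothetical names.

```javascript
// Lowest set bit of v (0 if v is 0).
function lowbit(v) {
    return v & -v;
}

// Assumption: 32-bit signed integers, exactly three odd-count values.
function findThree(arr) {
    // Pass 1: s = a ^ b ^ c.
    let s = 0;
    for (const x of arr) s ^= x;
    // Pass 2: XOR of lowbit(s ^ x); two odd-count values cancel,
    // leaving the lowbit that belongs to the third one.
    let t = 0;
    for (const x of arr) t ^= lowbit(s ^ x);
    // Pass 3: restrict to elements with that lowbit to recover c.
    let c = 0;
    for (const x of arr) if (lowbit(s ^ x) === t) c ^= x;
    // Pass 4: k = 2 method on the rest. s ^ c equals a ^ b, so split
    // on its lowest bit and undo c's contribution if c is in the group.
    const d = lowbit(s ^ c);
    let a = 0;
    for (const x of arr) if (x & d) a ^= x;
    if (c & d) a ^= c;
    const b = s ^ c ^ a;
    return [c, a, b];
}

console.log(findThree([5, 7, 7, 9, 12, 12, 12, 12, 6])); // [6, 5, 9]
```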

5 Comments

Why is the probability that some y!=x satisfies Ay=b less than that term? (2^(-ceil(1 + lg(k)))). And from that, how do you get to the expected number of elements being less than 1/2?
@AndreasT The probability that Ay=b is the probability that A(y-x) = 0, and y-x is nonzero, so half of the possible rows in A "disqualify" y. The strictness is because x is never disqualified. The expected number of elements is by summing probabilities and observing that the sum is less than k 2^-ceil(1 + lg k) < 1/2.
That makes sense for the probability of Ay=b. For the expected number of elements, do I understand correctly that you XOR all those y with the initial x? That way the non-unique elements which satisfy the equation cancel out themselves. That is why you only have to sum over k, not over n. And you keep repeating this process until the result is x again?
@AndreasT Right -- there may be many even-occurring elements that satisfy the equation, but they cancel.
Thanks, I get it now. Randomized algorithms are fascinating. This seems to be the most general answer (with k being arbitrary), so I'll mark it as best solution.
3

The question title says "the unique integer", but the question body says there can be more than one unique element.

If there is in fact only one non-duplicate: XOR all the elements together. The duplicates all cancel, because they come in pairs (or higher multiples of 2), so the result is the unique integer.

See Dante's answer for an extension of this idea that can handle two unique elements. It can't be generalized to more than that.

Perhaps for k unique elements, we could use k accumulators to track sum(a[i]**k), i.e. a[i], a[i]**2, etc. This probably only works for "Faster algorithm to find unique element between two arrays?", not this case where the duplicates are all in one array. IDK if an xor of squares, cubes, etc. would be any use for resolving things.

2 Comments

Your answer does not cover the input which has two integers that occur once.
Its the example given by OP.
1

Track the counts for each element and only return the elements with a count of 1. This can be done with a hash map. The example below tracks the result using a hash set while it's still building the counts map. It is still O(n) time but less efficient, though I think it's slightly more instructive.

Javascript with jsfiddle http://jsfiddle.net/nmckchsa/

function findUnique(arr) {
    var uniq = new Map();
    var result = new Set();
    // iterate through array
    for(var i=0; i<arr.length; i++) {
        var v = arr[i];
        // add value to map that contains counts
        if(uniq.has(v)) {
            uniq.set(v, uniq.get(v) + 1);
            // count is greater than 1 remove from set
            result.delete(v);
        } else {
            uniq.set(v, 1);
            // add a possibly uniq value to the set
            result.add(v);
        }
    }
    // set to array O(n)
    var a = [], x = 0;
    result.forEach(function(v) { a[x++] = v; });
    return a;
}
alert(findUnique([1,2,3,0,1,2,3,1,2,3,5,4,4]));

EDIT: Since the non-unique numbers appear an even number of times, @PeterCordes suggested a more elegant set toggle.

Here's how that would look.

function findUnique(arr) {
    var result = new Set();
    // iterate through array
    for(var i=0; i<arr.length; i++) {
        var v = arr[i];
        if(result.has(v)) { // even occurrences
            result.delete(v);
        } else { // odd occurrences
            result.add(v);
        }
    }
    // set to array O(n)
    var a = [], x = 0;
    result.forEach(function(v) { a[x++] = v; });
    return a;
}

JSFiddle http://jsfiddle.net/hepsyqyw/

11 Comments

You don't need uniq at all. Toggle v's membership in the set when you see it. Elements that appear an even number of times won't be part of the set when you're done. That's still O(n) additional space, since there is no guarantee that the duplicates come in nearby pairs. The question asked for O(1) space complexity.
No hash table guarantees O(1) lookup.
@MichaelLaszlo: Are you replying to my comment about space? Or are you saying you can't implement a hash table with guaranteed O(1) time lookups for worst-case inputs? I think you can get amortized O(1) lookups by growing the hash table when / if needed.
No, the worst case is O(n) amortized time per operation.
@MichaelLaszlo: Linear search is O(n) per operation. Hash lookup is O(n) amortized time per n operations, aka amortized constant time.
-1

Assuming you have an input array: [2,3,4,2,4] Output: 3

In Ruby, you can do something as simple as this:

[2,3,4,2,4].inject(0) {|xor, v| xor ^ v}

Comments

-3
  1. Create an array counts that has INT_MAX slots, with each element initialized to zero.

  2. For each element in the input list, increment counts[element] by one. (edit: actually, you will need to do counts[element] = (counts[element]+1)%2, or else you might overflow the value for really ridiculously large values of N. It's acceptable to do this kind of modulus counting because all duplicate items appear an even number of times)

  3. Iterate through counts until you find a slot that contains "1". Return the index of that slot.

Step 2 is O(N) time. Steps 1 and 3 take up a lot of memory and a lot of time, but neither one is proportional to the size of the input list, so they're still technically O(1).


(note: this assumes that integers have a minimum and maximum value, as is the case for many programming languages.)

21 Comments

By that standard of "technically correct", every deterministic program ever written that terminates does so in constant, i.e. O(1) time because it works on finite memory.
HA! That's a pretty fuzzy idea for O(1). What if it's a 64 bit integer? You're going to create an array that 2^64 and claim the memory space is constant? I mean, yes but...no. :)
Indeed the required space is constant in a technical sense, but I doubt that the OP had this in mind. For 32-bit integers, the table would need 4 gigabytes of memory. Most probably the OP imagined some constant space independent of the size of the actual implementation of int.
Reductio ad absurdum: If Kevin's argument is correct, then sorting takes O(n) time because you can always use radix sort with an array of size INT_MAX.
Then you're talking about the finite state machines that real world computers are, not algorithms in the relevant sense. In that case, everything that can be achieved can be achieved in O(1) time AND space, and the concept becomes irrelevant.