0

I am trying to count the number of unique numbers in a sorted array using binary search. I need to get the edge of the change from one number to the next to count. I was thinking of doing this without using recursion. Is there an iterative approach?

def unique(x):
    start = 0
    end = len(x)-1
    count =0
    # This is the current number we are looking for
    item = x[start]

    while start <= end:
        middle = (start + end)//2
        
        if item == x[middle]:
            start = middle+1
            
        elif item < x[middle]:
            end = middle -1
        
        #when item item greater, change to next number
        count+=1

    # if the number
    return count
    
unique([1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,5,5,5,5,5,5,5,5,5,5])

Thank you.

Edit: Even if the runtime benefit is negligent from o(n), what is my binary search missing? It's confusing when not looking for an actual item. How can I fix this?

16
  • This isn't recursive. But I'm not sure it even works, you never change item. Commented Aug 14, 2020 at 3:21
  • It's returning 5 in your example, the correct answer is 3. Commented Aug 14, 2020 at 3:23
  • it's not counting unique values, it's counting the number of iterations of binary search it takes to find the first element. Commented Aug 14, 2020 at 3:24
  • @Barmar yes I said I am looking for an iterative approach. I fixed return value. Yes, I know it's wrong that's why I posted here. Commented Aug 14, 2020 at 3:25
  • 2
    Seems that complexity for BS approach is O(klogn) where n is array size, k is number of unique items. When k is comparable with n, time becomes O(nlogn) - definitely slower than linear scan Commented Aug 14, 2020 at 3:47

2 Answers 2

1

Working code exploiting binary search (returns 3 for given example).

As discussed in comments, complexity is about O(k*log(n)) where k is number of unique items, so this approach works well when k is small compared with n, and might become worse than linear scan in case of k ~ n

def countuniquebs(A):
    n = len(A)
    t = A[0]
    l = 1
    count = 0
    while l < n - 1:
        r = n - 1
        while l < r:
            m = (r + l) // 2
            if A[m] > t:
                r = m
            else:
                l = m + 1
        count += 1
        if l < n:
            t = A[l]
    return count

print(countuniquebs([1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,5,5,5,5,5,5,5,5,5,5]))
Sign up to request clarification or add additional context in comments.

Comments

1

I wouldn't quite call it "using a binary search", but this binary divide-and-conquer algorithm works in O(k*log(n)/log(k)) time, which is better than a repeated binary search, and never worse than a linear scan:

def countUniques(A, start, end):
    len = end-start
    if len < 1:
        return 0
    if A[start] == A[end-1]:
        return 1
    if len < 3:
        return 2
    mid = start + len//2
    return countUniques(A, start, mid+1) + countUniques(A, mid, end) - 1

A = [1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,3,4,5,5,5,5,5,5,5,5,5,5]
print(countUniques(A,0,len(A)))

1 Comment

This is an awesome solution. Can you explain a bit about why the runtime ends up being O(k*log(n)/log(k))? This seems right since as k approaches N, the runtime approaches O(N).

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.