Binary search in 2 sorted integer arrays

Question

There is a big array which consists of 2 small integer arrays written one at the end of another. Both small arrays are sorted by ascending. We have to find an element in big array as fast, as possible. My idea was to find the end of the left array by binsearch in big array and then implement 2 binsearches on small arrays. The problem is that I don't know how to find that end. If you have an idea, how to find element without finding borders of smaller arrays, you're welcome!

Information about arrays: both small arrays have integer elements, both are sorted by ascending, they both can have length from 0 to any positive integer number, but there can be only one copy of an element. Here are some examples of big arrays:

1 2 3 4 5 6 7 (all the elements of the second array are bigger, than the maximum of the first array)
100 1 (both arrays have only one element)
1 3 5 2 4 6 or 2 4 6 1 3 5 (most common situations)

And your question is? Also if you know something and can't prove it, by definition are you not assuming you are correct? Just saying might want to put a little more thought and a question mark into your "question." — CalebB
– CalebB, Commented Dec 6, 2015 at 20:52
O(n) would be a linear search, i.e. you can look at every element in the "big array". If you look at every element, then you must find the one you are looking for, if it exists. Where does binary search come in to it? — Andrew Morton
– Andrew Morton, Commented Dec 6, 2015 at 20:56
I think I can give you an answer, but I'll need more information about the two arrays [call them A and B]. Q1: Do they have overlapping values? Are all values in A < all in B? Q2: Do you always have (e.g.) A:1,2,3 B:7,8,9 or can you have A:1,2,3,4,5 B:4,7,9? Q3: What about the relative lengths of A/B? Basically, add as much information as you can. Do you have representative sample data that you can post, multiple cases showing all possibilities would help. — Craig Estey
– Craig Estey, Commented Dec 6, 2015 at 23:20
Well, to be honest, I do not have any other data about arrays. I've also been thinking a lot about this task and I understood, that it is probably impossible to complete the task without more information. So, as a result, the search in such an array can't be made less, than with O(n) time without more data, but unfrotunatly I haven't proved this statement yet. Do you agree with this idea? — Cambronnes
– Cambronnes, Commented Dec 7, 2015 at 5:57
No, I'm going to disagree--to your benefit :-). I still think it's doable in less than O(n). Without the "narrowing" info, this just means that all "cases" have to be tried until a match is found. I think there's only about 8 [(e.g.) with more info, this might be [say] 4]. Even if we only hit it on the last one, it's still better than linear. I'm working on a proof-of-concept program that will test for all possible scenarios. But, it's late here, so I'll have to push this until tomorrow. — Craig Estey
– Craig Estey, Commented Dec 7, 2015 at 8:54

David · Accepted Answer · 2015-12-07 22:58:43Z

2

This problem is impossible to solve in guaranteed time complexity faster than O(n) and not possible to solve at all for certain arrays. Binary search runs in O(log n) for a sorted array, but the big array is not guaranteed to be sorted and will in the worst-case require one or more comparisions per element, which is O(n). The best guaranteed time complexity is O(n) with the trivial algorithm: compare every item with its neighbour until you find the "turning point" with A[i] > A[i+1]. However, if you use a breadth-first search, you may get lucky and find the "turning point" early.

Proof that the problem is unsolvable for some arrays: let the array M = [A B] be our big array. To find the point where the arrays meet we're looking for an index i where M[i] > M[i+1]. Now let A=[1 2 3] and B=[4 5]. There is no index in the array M for which the condition holds true, thus the problem is unsolvable for some arrays.

Informal proof for the former: let M=[A B] and A=[1..x] and B=[(x+1)..y] be two sorted arrays. Then swap the positions of element x and y in M. We have no way of finding the index of x without (in the worst case) checking every index, thus the problem is O(n).

Binary search relies on being able to eliminate half the solution space with each comparision, but in this case we cannot eliminate anything from the array and so we cannot do better than a linear search.

(From a practical standpoint, you should never do this in a program. The two arrays should be separate. If this isn't possible, append the length of either array to the bigger array.)

Edit: changed my answer after question was updated. It's possible to do it faster than linear time for some arrays, but not all possible arrays. Here's my idea for an algorithm using breadth-first search:

Start with the interval [0..n-1] where n is the length of the big array.
Make a list of intervals and put the starting interval in it.
For each interval in the list:
    if the interval is only two elements and the first element is greater than the last
        we found the turning point, return it
    else if the interval is two elements or less
        remove it from the list
    else if the first element of the interval is greater than the last
        turning point is in this interval
        clear the list
        split this interval in two equal parts and add them to the list
    else
        split this interval in two equal parts and replace this interval in the list with the two parts

I think a breadth-first approach will increase the odds of finding an interval where A[first] > A[last] early. Note that this approach will not work if the turning point is between two intervals, but it's something to get you started. I would test this myself, but unfortunately I don't have the time now.

edited Dec 7, 2015 at 22:58

answered Dec 7, 2015 at 8:58

David

9531 gold badge10 silver badges27 bronze badges

Sign up to request clarification or add additional context in comments.

10 Comments

Emil Vikström Over a year ago

In your first example it doesn't (or shouldn't) matter for the user if the real border between the arrays are found or if you consider the border to be at the beginning or the end of the big array (thus treating the problem as if A or B were empty).

David Over a year ago

@EmilVikström well it's not clear what the values are going to be used for. For example, let's say A is the length of all girls in a class and B the length of all boys. It might be important for an application to maintain the distinction and not put the tallest girl with the boys.

Emil Vikström Over a year ago

True! If the values of the two arrays are not directly comparable we would indeed need to know the exact array boundary. I also realize that the question as asked is how to find that boundary, not really how to find the element. So given those constraints your example works perfectly and would probably give bonus points on a homework assignment (if the student realized it herself instead of asking).

Cambronnes Over a year ago

@David Thanks for replying! I'm sorry that I haven't formulated the problem clear enough. So, I need to find an element in array, which is made by algorithm described in the title. I meant to find the end of the left array to make then binsearch on both arrays. If there is no border between arrays, it's just amazing, because I will have to make only 1 binsearch. Once again, the aim is to find the element as fast as it is possible and I really do not catch the idea, why there is no algorithm taking less, than O(n) to find an element in such an array.

Emil Vikström Over a year ago

Cambronnes, you haven't specified if the two arrays represent the same type of data. Is it all just integers? Can we assume that for example 3 in the first array means the same thing as 3 in the second?

|

Collectives™ on Stack Overflow

Binary search in 2 sorted integer arrays

1 Answer 1

10 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

10 Comments

Your Answer

Sign up or log in

Post as a guest

Related