4

Input: Array of N positive numbers and a value X such that N is small compared to X
Output: Subarray with sum of all its numbers equal to Y > X, such that there is no other subarray with sum of its numbers bigger than X but smaller than Y.

Is there a polynomial solution to this question? If so, can you present it?

10
  • do you have a restriction on the elements in the array? Commented Feb 16, 2016 at 15:31
  • Do the sub array values have to be adjacent within the larger array? Does it matter if the solution is not unique? Commented Feb 16, 2016 at 15:33
  • @svs Generally No. But if it helps you can restrict them to no duplicates, positive values, smaller than 10^10 Commented Feb 16, 2016 at 15:37
  • Is O(X*Y) acceptable? Commented Feb 16, 2016 at 15:39
  • @JeffIrwin There could be multiple solutions equal to Y, I will be satisfied with finding just one of them. And there is no restriction of the result sub array. Commented Feb 16, 2016 at 15:39

3 Answers 3

4

As the other answers indicate this is a NP-Complete problem which is called the "Knapsack Problem". So there is no polynomial solution. But it has a pseudo polynomial time algorithm. This explains what pseudo polynomial is.

A visual explanation of the algorithm.

And some code.

If this is work related (I met this problem a few times already, in various disguises) I suggest introducing additional restrictions to simplify it. If it was a general question you may want to check other NP-Complete problems as well. One such list.

Edit 1:

AliVar made a good point. The given problem searches for Y > X and the knapsack problem searches for Y < X. So the answer for this problem needs a few more steps. When we are trying to find the minimum sum where Y > X we are also looking for the maximum sum where S < (Total - X). The second part is the original knapsack problem. So;

  1. Find the total
  2. Solve knapsack for S < (Total - X)
  3. Subtrack the list of items in knapsack solution from the original input.
  4. This should give you the minimum Y > X
Sign up to request clarification or add additional context in comments.

4 Comments

In the original Knapsack problem we have Y<X. Thus I think this problem needs a minor modification to be considered a Knapsack problem. Right?
My interpretation is in both cases the balance point is Y=X and we are trying to find the best approximation. I don’t think approaching it from the left or right changes the nature of the problem. But you are right my answer needs clarification. Please see Edit 1. Thanks for the input :)
This will not work for 3,1,9,10,7; x = 12. Total is 30. knapsack solution for Total - X = 11 > [10] . The subtraction from original input gives: [3,1,9,7] = 19. While the solution to the problem is [10, 3] =13.
I meant this: Total - X = 30 - 12 = 18; maxSum for Y < 18 = (7 + 10) = 17; MinSum for Y > 12 is there for => (3,1,9) = 13;
2

Let A be our array. Here is a O(X*N) algorithm:

initialize set S = {0}
initialize map<int, int> parent
best_sum = inf
best_parent = -1
for a in A
     Sn = {}
     for s in S
         t = s + a
         if t > X and t < best_sum
             best_sum = t
             best_parent = s
         end if
         if t <= X
             Sn.add(t)
             parent[t] = s
         end if
     end for
     S = S unite with Sn
end for

To print the elements in the best sum print the numbers:

Subarray = {best_sum - best_parent}
t = best_parent
while t in parent.keys()
    Subarray.add(t-parent[t])
    t = parent[t]
end while
print Subarray

The idea is similar to the idea of dynamic programming. We just calculate all reachable (those that could be obtained as a subarray sum) sums that are less than X. For each element a in the array A you could either choose to participate in the sum or not. At the update step S = S unite with Sn S represent all sums in which a does not participate while Sn all sum in which a do participate.

You could represent S as a boolean array setting a item true if this item is in the set. Note that the length of this boolean array would be at most X.

Overall, the algorithm is O(X*N) with memory usage O(X).

4 Comments

How do you choose the value Y? Also can you give a brief words explanation to what you are doing here. Btw I didn't saw that you used X.
What is p? I assumed it is the same as parent, but when I try to print the elements, it does not work. However, your algorithm does give the correct best_sum (verified by brute-force).
@JeffIrwin yea, p should be parent. It should be fine. I will try to test it.
@svs I think my original implementation was wrong, but your edit cleared things up. I was getting the correct sum with an off-by-one error in parent, now it all works.
1

I think this problem is NP-hard and the subset sum can be reduced to it. Here is my reduction: For an instance of the subset sum with set S={x1,...,xn} it is desired to find a subset with sum t. Suppose d is the minimum distance between two non-equal xi and xj. Build S'={x1+d/n,...,xn+d/n} and feed it to your problem. Suppose that your problem found an answer; i.e. a subset D' of S' with sum Y>t which is the smallest sum with this property. Name the set of original members of D' as D. Three cases may happen: 1) Y = t + |D|*d/n which means D is the solution to the original subset sum problem. 2) Y > t + |D|*d/n which means no answer set can be found for the original problem. 3) Y < t + |D|*d/n. In this case assign t=Y and repeat the problem. Since the value for the new t is increased, this case will not repeat exponentially. Therefore, the procedure terminates in polynomial time.

3 Comments

Looks interesting! I edited the question based on your answer to be more specific in what am I looking for. Please tell me if this change your result.
You have the right idea, but case 3 is a problem -- it's not clear that t will increase fast enough. The problem occurs because you aren't choosing d quite right... Even if the smallest difference of any non-equal xi and xj is d, the smallest nonzero difference between the sums of any two disjoint sets of numbers in S could be arbitrarily smaller -- e.g. S={1, 3, 4, 5, 6.001, 7.001} has d=1 and thus d/n=1/6, but can produce a difference of (6.001+7.001)-(1+3+4+5) = 0.002. Instead, pick d to be the largest common divisor of all numbers in S ...
... Now the difference between any two disjoint sets of numbers in S must be an integer multiple of d, which means that case 3 can't happen -- no other subset E of S can map to an E' of S' that "sneaks in front of" the "right" D' (i.e. the one that maps back to D as per case 2). (Note: taking LCDs implies that the algorithm will only work for rational numbers, which is only a restriction in theory, since in practice irrational numbers can't be represented on computers anyway...)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.