Smallest sum of subarray with sum greater than a given value

Question

Input: Array of N positive numbers and a value X such that N is small compared to X
Output: Subarray with sum of all its numbers equal to Y > X, such that there is no other subarray with sum of its numbers bigger than X but smaller than Y.

Is there a polynomial solution to this question? If so, can you present it?

Do the sub array values have to be adjacent within the larger array? Does it matter if the solution is not unique? — Jeff Irwin
– Jeff Irwin, Commented Feb 16, 2016 at 15:33
@svs Generally No. But if it helps you can restrict them to no duplicates, positive values, smaller than 10^10 — Ilya Gazman
– Ilya Gazman, Commented Feb 16, 2016 at 15:37
@JeffIrwin There could be multiple solutions equal to Y, I will be satisfied with finding just one of them. And there is no restriction of the result sub array. — Ilya Gazman
– Ilya Gazman, Commented Feb 16, 2016 at 15:39

Orhan Tuncer · Accepted Answer · 2016-02-17 08:06:21Z

4

As the other answers indicate this is a NP-Complete problem which is called the "Knapsack Problem". So there is no polynomial solution. But it has a pseudo polynomial time algorithm. This explains what pseudo polynomial is.

A visual explanation of the algorithm.

And some code.

If this is work related (I met this problem a few times already, in various disguises) I suggest introducing additional restrictions to simplify it. If it was a general question you may want to check other NP-Complete problems as well. One such list.

Edit 1:

AliVar made a good point. The given problem searches for Y > X and the knapsack problem searches for Y < X. So the answer for this problem needs a few more steps. When we are trying to find the minimum sum where Y > X we are also looking for the maximum sum where S < (Total - X). The second part is the original knapsack problem. So;

Find the total
Solve knapsack for S < (Total - X)
Subtrack the list of items in knapsack solution from the original input.
This should give you the minimum Y > X

edited Feb 17, 2016 at 8:06

answered Feb 16, 2016 at 21:00

Orhan Tuncer

1545 bronze badges

Sign up to request clarification or add additional context in comments.

4 Comments

AliVar Over a year ago

In the original Knapsack problem we have Y<X. Thus I think this problem needs a minor modification to be considered a Knapsack problem. Right?

Orhan Tuncer Over a year ago

My interpretation is in both cases the balance point is Y=X and we are trying to find the best approximation. I don’t think approaching it from the left or right changes the nature of the problem. But you are right my answer needs clarification. Please see Edit 1. Thanks for the input :)

Ilya Gazman Over a year ago

This will not work for 3,1,9,10,7; x = 12. Total is 30. knapsack solution for Total - X = 11 > [10] . The subtraction from original input gives: [3,1,9,7] = 19. While the solution to the problem is [10, 3] =13.

Orhan Tuncer Over a year ago

I meant this: Total - X = 30 - 12 = 18; maxSum for Y < 18 = (7 + 10) = 17; MinSum for Y > 12 is there for => (3,1,9) = 13;

sve · Accepted Answer · 2016-02-16 21:58:26Z

2

Let A be our array. Here is a O(X*N) algorithm:

initialize set S = {0}
initialize map<int, int> parent
best_sum = inf
best_parent = -1
for a in A
     Sn = {}
     for s in S
         t = s + a
         if t > X and t < best_sum
             best_sum = t
             best_parent = s
         end if
         if t <= X
             Sn.add(t)
             parent[t] = s
         end if
     end for
     S = S unite with Sn
end for

To print the elements in the best sum print the numbers:

Subarray = {best_sum - best_parent}
t = best_parent
while t in parent.keys()
    Subarray.add(t-parent[t])
    t = parent[t]
end while
print Subarray

The idea is similar to the idea of dynamic programming. We just calculate all reachable (those that could be obtained as a subarray sum) sums that are less than X. For each element a in the array A you could either choose to participate in the sum or not. At the update step S = S unite with Sn S represent all sums in which a does not participate while Sn all sum in which a do participate.

You could represent S as a boolean array setting a item true if this item is in the set. Note that the length of this boolean array would be at most X.

Overall, the algorithm is O(X*N) with memory usage O(X).

edited Feb 16, 2016 at 21:58

answered Feb 16, 2016 at 15:58

sve

4,3661 gold badge21 silver badges31 bronze badges

4 Comments

Ilya Gazman Over a year ago

How do you choose the value Y? Also can you give a brief words explanation to what you are doing here. Btw I didn't saw that you used X.

Jeff Irwin Over a year ago

What is p? I assumed it is the same as parent, but when I try to print the elements, it does not work. However, your algorithm does give the correct best_sum (verified by brute-force).

sve Over a year ago

@JeffIrwin yea, p should be parent. It should be fine. I will try to test it.

Jeff Irwin Over a year ago

@svs I think my original implementation was wrong, but your edit cleared things up. I was getting the correct sum with an off-by-one error in parent, now it all works.

AliVar · Accepted Answer · 2016-02-16 19:40:54Z

1

I think this problem is NP-hard and the subset sum can be reduced to it. Here is my reduction: For an instance of the subset sum with set S={x1,...,xn} it is desired to find a subset with sum t. Suppose d is the minimum distance between two non-equal xi and xj. Build S'={x1+d/n,...,xn+d/n} and feed it to your problem. Suppose that your problem found an answer; i.e. a subset D' of S' with sum Y>t which is the smallest sum with this property. Name the set of original members of D' as D. Three cases may happen: 1) Y = t + |D|*d/n which means D is the solution to the original subset sum problem. 2) Y > t + |D|*d/n which means no answer set can be found for the original problem. 3) Y < t + |D|*d/n. In this case assign t=Y and repeat the problem. Since the value for the new t is increased, this case will not repeat exponentially. Therefore, the procedure terminates in polynomial time.

answered Feb 16, 2016 at 19:40

AliVar

1695 bronze badges

3 Comments

Ilya Gazman Over a year ago

Looks interesting! I edited the question based on your answer to be more specific in what am I looking for. Please tell me if this change your result.

j_random_hacker Over a year ago

You have the right idea, but case 3 is a problem -- it's not clear that t will increase fast enough. The problem occurs because you aren't choosing d quite right... Even if the smallest difference of any non-equal xi and xj is d, the smallest nonzero difference between the sums of any two disjoint sets of numbers in S could be arbitrarily smaller -- e.g. S={1, 3, 4, 5, 6.001, 7.001} has d=1 and thus d/n=1/6, but can produce a difference of (6.001+7.001)-(1+3+4+5) = 0.002. Instead, pick d to be the largest common divisor of all numbers in S ...

j_random_hacker Over a year ago

... Now the difference between any two disjoint sets of numbers in S must be an integer multiple of d, which means that case 3 can't happen -- no other subset E of S can map to an E' of S' that "sneaks in front of" the "right" D' (i.e. the one that maps back to D as per case 2). (Note: taking LCDs implies that the algorithm will only work for rational numbers, which is only a restriction in theory, since in practice irrational numbers can't be represented on computers anyway...)

Collectives™ on Stack Overflow

Smallest sum of subarray with sum greater than a given value

3 Answers 3

4 Comments

4 Comments

3 Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

4 Comments

4 Comments

3 Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related