Maximum subarray sum for Python

Question

There is a task on codewars that asks to do the following: The maximum sum subarray problem consists in finding the maximum sum of a contiguous subsequence in an array or list of integers. If the list is made up of only negative numbers, return 0 instead. So I tried to do this problem and I thought I succeeded but apparently it's not the case. Here's code I came up with:

def maxSequence(arr):
    sums = []
    for e in arr:
        for i in range(arr.index(e)+1, len(arr)): 
            e += arr[i]
            sums.append(e)
    if all(i<0 for i in arr) or not sums:
        return 0
    else:
        max_sum = max(sums)
        return max_sum

Tests show that around 90% times it's good but not every time and can't clear this task because of it. Some that didn't worked: 64 should equal 78 or 134 should equal 144 - so when it's not working, the result is less by around 10-15 so one number off I guess. Sadly, you can't see examples of these lists to see how it's not working there, it's just final result being wrong. I don't understand what's the case that isn't covered with my code. Can somebody explain what I should change to have 100% rate?

Can you also provide an example input and desired output please? EDIT: Sorry you mention that you can't provide an example. Without one, it's going to be difficult to give you a good answer. — kingkupps
– kingkupps, Commented Dec 7, 2019 at 19:18
The problem statement is not sufficiently clear. Given a sequence of numbers, what is a "maximum sum"? How is a "contiguous subsequence" defined? — norok2
– norok2, Commented Dec 7, 2019 at 19:21
This isn't a fix for your algorithm issues, but you should learn about enumerate, rather than using index(e). For example: for idx, val in enumerate(arr): — jarmod
– jarmod, Commented Dec 7, 2019 at 19:31

norok2 · Accepted Answer · 2020-12-27 13:04:01Z

2

There are two issues with your code:

you do not consider lenght-1 subsequences. You should add e in the sums list.
your nested loop starts from the index at which the first occurrence of e is observed and not the actual index of e. This is a problem in case of duplicates.

Hence, the following should work:

def maxSequence_fixed(arr):
    sums = []
    for j, e in enumerate(arr):
        sums.append(e)  # <-- HERE
        for i in range(j + 1, len(arr)): 
            e += arr[i]
            sums.append(e)
    if all(i<0 for i in arr) or not sums:
        return 0
    else:
        max_sum = max(sums)
        return max_sum

Note that there are more efficient / cleaner approaches than this, e.g.:

def max_sum_subseq(items):
    iter_items = iter(items)
    try:
        temp_sum = next(iter_items)
    except StopIteration:
        temp_sum = 0
    max_sum = temp_sum
    for item in iter_items:
        temp_sum = max(temp_sum + item, item)
        max_sum = max(temp_sum, max_sum)
    return max(max_sum, 0)

which is O(N) in time and O(1) in memory.

Note that the case of all-negative inputs is handled by max(max_sum, 0) just before return. In case that is not required, it is sufficient to return just max_sum.

An alternative approach would have been to use max(next(iter_items), 0) instead of just next(iter_items) as the first temp_sum value.

And a faster approach would avoid the relatively expensive max() calls:

def max_sum_subseq_fast(items):
    iter_items = iter(items)
    try:
        temp_sum = next(iter_items)
    except StopIteration:
        temp_sum = 0
    max_sum = temp_sum
    for item in iter_items:
        temp_sum += item
        if item > temp_sum:
            temp_sum = item
        if temp_sum > max_sum:
            max_sum = temp_sum
    return max_sum if max_sum > 0 else 0

Note that both max_sum_subseq() and max_sum_subseq_fast() can work on any Iterable (and do not require a Sequence, contrarily to the other solutions compared).

If one need to support only Sequences, then the block:

...
    iter_items = iter(items)
    try:
        temp_sum = next(iter_items)
    except StopIteration:
        temp_sum = 0
    max_sum = temp_sum
    for item in iter_items:
...

can be replaced with:

...
    if len(items) < 1:
        return 0
    max_sum = temp_sum = 0
    for item in items:
...

with only very marginal O(1) speed gain.

Another, much less efficient (but closer to your thinking) solution would be:

def max_sum_subseq_slow(items):
    max_sum = 0
    for i in range(len(items)):
        for j in range(i, len(items)):
            temp_sum = sum(items[i:j + 1])
            if temp_sum > max_sum:
                max_sum = temp_sum
    return max_sum

Some sanity checks below:

seqs = [
    [],
    [-1],
    [1, -1],
    [1, -1, 1],
    [1, -1, 1, 1],
    [1, -1, 1, 1, 1],
    [1, -1, -1, 1, 1],
]

funcs = maxSequence_OP, maxSequence_fixed, max_sum_subseq, max_sum_subseq_fast, max_sum_subseq_slow

for seq in seqs:
    print(seq)
    for func in funcs:
        print(func(seq))

# []
# 0
# 0
# 0
# 0
# 0

# [-1]
# 0
# 0
# 0
# 0
# 0

# [1, -1]
# 0
# 1
# 1
# 1
# 1

# [1, -1, 1]
# 1
# 1
# 1
# 1
# 1

# [1, -1, 1, 1]
# 2
# 2
# 2
# 2
# 2

# [1, -1, 1, 1, 1]
# 3
# 3
# 3
# 3
# 3

# [1, -1, -1, 1, 1]
# 1
# 2
# 2
# 2
# 2

More test cases can be generate using:

import random


seqs = [[random.randint(-10, 10) for _ in range(20)] for __ in range(100)]

Some timings:

funcs = maxSequence_OP, maxSequence_fixed, max_sum_subseq, max_sum_subseq_fast, max_sum_subseq_slow, maxSequence_Alexander

seq = [random.randint(-10, 10) for _ in range(1000)]
for func in funcs:
    print()
    print(func.__name__)
    %timeit func(seq)

# maxSequence_OP
# 10 loops, best of 3: 168 ms per loop

# maxSequence_fixed
# 10 loops, best of 3: 78 ms per loop

# max_sum_subseq
# 1000 loops, best of 3: 292 µs per loop

# max_sum_subseq_fast
# 10000 loops, best of 3: 66.3 µs per loop

# max_sum_subseq_slow
# 1 loop, best of 3: 1.21 s per loop

# maxSequence_Alexander
# 10000 loops, best of 3: 183 µs per loop

(EDITED to fix minor issues, include a faster solution and updated benchmarks, including a comparison with @Alexander's solution)

edited Dec 27, 2020 at 13:04

answered Dec 7, 2019 at 19:32

norok2

27.1k6 gold badges83 silver badges110 bronze badges

Sign up to request clarification or add additional context in comments.

8 Comments

Alexander Over a year ago

If you're doing timings, you may notice that my solution is significantly faster than your fastest solution above (although both solutions are still O(n), which is the main point considering the OP's O(n^2) solution).

norok2 Over a year ago

@Alexander Of course your solution may be faster than max_sum_subseq() in some cases (e.g. if you have all but one last value negative, I bet your solution will be slower as it will loop through the items twice). The gotcha is the max() call, which is the truly more expensive operation inside the loop. I preferred clarity (and generality -- the solution requires only Iterable and not Sequence like yours) over pure speed, given that this is mostly a didactic exercise. In a production settings, I would have made different choices. Anyway see the edits for faster sol. and update timing.

Alexander Over a year ago

You are correct to notice that it is more efficient to test if a new max value should be updated before blindly using max_val = max(current_val, max_val). I myself had that in my code (local_max = max(0, local_max)). Correcting that to if local_max < 0: local_max = 0, then our timings are virtually identical.

Alexander Over a year ago

I chose to short-circuit the initial check for all negatives (if not arr or all(val <= 0 for val in arr): return 0) because I view it as highly unlikely that most normal sequences would consist of mostly negative numbers. In fact, this would short circuit upon the first occurrence of the first positive number. Instead, one could simply remove that check and instead include return max(global_max, 0) at the end.

chadlei Over a year ago

Hi @norok2, just curious as to why you chose to use iter() in your O(n) solution instead of just adding an if statement to check for an empty list and then for-looping through the list itself?

|

Dhawan Gayash · Accepted Answer · 2020-09-07 21:10:44Z

1

This is a fairly famous problem, warranting its own wikipedia article: max-subarray.

The python code is actually quite elegant:

def max_sub_array(A):
  curr_max_sum = float('-inf')
  global_max = float('-inf')
  for i in A:
    curr_max_sum = max(curr_max_sum + i, i)
    global_max = max(curr_max_sum, global_max)
  return global_max

I think you are failing test cases because of the time complexity of your code. The solution should have O(N) time complexity and O(1) space complexity.

edited Sep 7, 2020 at 21:10

answered Dec 7, 2019 at 19:39

Dhawan Gayash

4631 gold badge4 silver badges17 bronze badges

Comments

Alexander · Accepted Answer · 2019-12-07 19:51:56Z

Keep track of a local and global max. If the local max drops below zero, start a new one as the new running sum will always be greater than the previous negative sum plus the new number. Return the global max.

def maxSequence(arr):
    if not arr or all(val <= 0 for val in arr):
        return 0
    local_max = global_max = arr[0]
    for val in arr[1:]:
        local_max += val
        local_max = max(0, local_max)  # Restart `local_max` if it becomes negative.
        if local_max > global_max:
            global_max = local_max
    return global_max

Your code is O(n^2) time complexity (vs O(n) above) which may be why it is failing some tests.

>>> maxSequence([0, -1, -2])
0

>>> maxSequence([1, 2, 3 -7, 10, 15])
25  # 10 + 15

>>> maxSequence([1, 2, 3 -7, 10])
10  # Final number 10

>>> maxSequence([6, 2, 3 -7, 10])
14  # 6 + 2 + 3 - 7 + 10

Bania Fonseca · Accepted Answer · 2020-09-04 14:53:29Z

0

from collections import namedtuple

def find_maximum_subarray(array):   
    ms = namedtuple('ms', ['low', 'high', 'sum'])
    ms.low = 0
    ms.high = 0
    ms.sum = array[0]
    next_subarray_sum = ms.sum
    i = 0
    for j in range(1, len(array)):
        next_subarray_sum = next_subarray_sum + array[j]
        if next_subarray_sum >= ms.sum:
            ms.sum = next_subarray_sum
            ms.low = i
            ms.high = j
        else:
            i = i + 1
            next_subarray_sum = next_subarray_sum - array[i-1] 
    return ms

answered Sep 4, 2020 at 14:53

Bania Fonseca

11 bronze badge

Collectives™ on Stack Overflow

Maximum subarray sum for Python

4 Answers 4

8 Comments

Comments

Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

8 Comments

Comments

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related