Tuesday, June 28, 2016

The Maximum Contiguous Subarray Problem

Among all the contiguous subarrays of a given array, find one in which the sum of the elements is maximized. This problem is known as the maximum contiguous subarray sum problem, and we will see a few ways in which it can be solved. We will call the array $A$ and assume that it has $n$ numbers. For example, if $A = [-2, 1, -3, 4, -1, 2, 1, -5, 4]$, the answer is $6$, achieved by the subarray $[4, -1, 2, 1]$.

Brute Force

The brute force solution is to loop over all of the subarrays and take any one that maximizes the sum. There are $\Theta(n^2)$ subarrays and it takes linear time to compute each sum, so the brute force approach gives an $\mathcal{O}(n^3)$ algorithm for this problem. Note that we print the sum at the end, but it is easy to augment the algorithm to also report the two ends of the subarray.
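
A minimal Python sketch of this approach (the function name is my own, and it assumes $A$ is non-empty):

    def max_subarray_brute_force(A):
        n = len(A)
        best = A[0]
        for i in range(n):                    # left end of the subarray
            for j in range(i, n):             # right end of the subarray
                s = 0
                for k in range(i, j + 1):     # recompute the sum from scratch
                    s += A[k]
                best = max(best, s)
        print(best)

    max_subarray_brute_force([-2, 1, -3, 4, -1, 2, 1, -5, 4])  # prints 6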

Optimized Brute Force

With a bit more cleverness we can shave off a linear factor: fix the left end and keep a running sum as the right end advances, instead of recomputing the sum from scratch in a nested loop.
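
A Python sketch of the quadratic version, under the same assumptions as before:

    def max_subarray_quadratic(A):
        n = len(A)
        best = A[0]
        for i in range(n):            # fix the left end
            s = 0                     # running sum of A[i..j]
            for j in range(i, n):     # extend the right end one element at a time
                s += A[j]
                best = max(best, s)
        print(best)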

Divide and Conquer

The divide and conquer approach to problem solving is to take the original problem, split it into smaller subproblems, solve the subproblems recursively, and use the solutions to the subproblems to construct the solution to the original problem. Now imagine that we split the array into two halves; what can we say about the optimal subarray? It can be completely contained in the left half, completely contained in the right half, or it can start somewhere in the left half and end somewhere in the right half. The first two cases are subproblems of the problem we started with, so we can solve them recursively. The third case can be solved independently.

Let's call the split point $m$. The left part of the crossing subarray can be computed by starting at position $m$ and going down to position $0$ (more generally, to the left boundary of the current range) while keeping track of the maximum sum. The right part can be computed similarly: start at position $m+1$ and go up to position $n-1$ (the right boundary) while keeping track of the maximum sum. This is illustrated in the following figure.

[Figure: the best crossing subarray is the best suffix ending at $m$ joined with the best prefix starting at $m+1$.]

The following Python sketch implements the above idea (the helper name max_crossing_sum and the $(lo, hi)$ interface are my own):
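
    def max_crossing_sum(A, lo, m, hi):
        # Best sum of a subarray that starts in A[lo..m] and ends in A[m+1..hi].
        left_best = s = A[m]
        for i in range(m - 1, lo - 1, -1):   # go down from m towards lo
            s += A[i]
            left_best = max(left_best, s)
        right_best = s = A[m + 1]
        for i in range(m + 2, hi + 1):       # go up from m+1 towards hi
            s += A[i]
            right_best = max(right_best, s)
        return left_best + right_best

    def max_subarray(A, lo, hi):
        if lo == hi:                   # base case: a single element
            return A[lo]
        m = (lo + hi) // 2             # split point
        return max(max_subarray(A, lo, m),              # entirely in the left half
                   max_subarray(A, m + 1, hi),          # entirely in the right half
                   max_crossing_sum(A, lo, m, hi))      # crosses the split point

    print(max_subarray([-2, 1, -3, 4, -1, 2, 1, -5, 4], 0, 8))  # prints 6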

The running time of this algorithm satisfies the recurrence $T(n) = 2T(n/2) + \Theta(n)$, which gives $\mathcal{O}(n \lg n)$ after unfolding.
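
To see why, unfold the recurrence for $\lg n$ levels, writing $cn$ for the $\Theta(n)$ term and assuming $n$ is a power of two:

$$T(n) = 2T(n/2) + cn = 4T(n/4) + 2cn = \cdots = nT(1) + cn \lg n = \mathcal{O}(n \lg n).$$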

Dynamic Programming

There is another way to split the problem into smaller subproblems and use the solutions to the smaller subproblems to construct the solution to the original problem. Imagine that you are given the maximum sum ending at a position $p$, i.e., the maximum contiguous subarray sum with the additional constraint that the right end of the subarray is at position $p$. How can this information be used to find the maximum contiguous subarray ending at position $p+1$? First take a look at this figure to see the whole picture.

[Figure: $S$ is the maximum sum over all subarrays ending at position $p$; the next element is $A[p+1]$.]

Note that $S$ is the best sum we can get by fixing the right end at position $p$ (adding elements to the left won't improve it). Now, to construct the maximum sum ending at position $p+1$, we have two options:

  1. Extend the previous sum with $A[p + 1]$, obtaining $S + A[p + 1]$, or
  2. Take $A[p + 1]$ on its own and ignore $S$.

These are the only options we have; trying to extend just a part of $S$ is useless, because $S$ is the maximum sum that ends at the previous position, so any shorter suffix ending at position $p$ sums to at most $S$ and would clearly give a suboptimal answer. In other words, the maximum sum ending at position $p+1$ is $\max(S + A[p + 1], A[p + 1])$.

This way of splitting the problem gives us an efficient way to find the maximum contiguous subarray sum ending at a given position. Since the optimal subarray has to end somewhere, we just solve the problem for each of the $n$ ending positions and take the one that maximizes the sum.
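
A Python sketch of this dynamic programming solution (this recurrence is commonly known as Kadane's algorithm; as before, the function name is my own):

    def max_subarray_dp(A):
        best_ending_here = A[0]   # S: the maximum sum ending at the current position
        best = A[0]
        for x in A[1:]:
            # Either extend the previous sum with x, or start fresh at x alone.
            best_ending_here = max(best_ending_here + x, x)
            best = max(best, best_ending_here)
        print(best)

    max_subarray_dp([-2, 1, -3, 4, -1, 2, 1, -5, 4])  # prints 6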

The difference between the divide and conquer approach and the dynamic programming approach is that the subproblems overlap in the case of dynamic programming, whereas they don't in the case of divide and conquer. The running time of the dynamic programming algorithm is $\mathcal{O}(n)$, because there are $n$ subproblems and computing the answer for each of them takes a constant amount of work.