Maximum subarray problem

In computer science, the maximum subarray problem is the task of finding the contiguous subarray within a one-dimensional array of numbers (containing at least one positive number) which has the largest sum. For example, for the sequence of values −2, 1, −3, 4, −1, 2, 1, −5, 4, the contiguous subarray with the largest sum is 4, −1, 2, 1, with sum 6.
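
As a reference point for the definition above, the following is a minimal brute-force sketch that tries every non-empty contiguous subarray and reproduces the example. The function name `max_subarray_brute_force` and its return shape (sum, inclusive start index, inclusive end index) are illustrative choices, not taken from the cited sources, and the quadratic running time makes it suitable only for checking small inputs.

```python
def max_subarray_brute_force(values):
    """Return (best_sum, start, end) over all non-empty contiguous subarrays.

    start and end are inclusive indices. Hypothetical helper for illustration;
    it examines O(n^2) subarrays.
    """
    if not values:
        raise ValueError("values must be non-empty")
    best_sum, best_start, best_end = values[0], 0, 0
    for start in range(len(values)):
        running = 0  # sum of values[start..end]
        for end in range(start, len(values)):
            running += values[end]
            if running > best_sum:
                best_sum, best_start, best_end = running, start, end
    return best_sum, best_start, best_end


example = [-2, 1, -3, 4, -1, 2, 1, -5, 4]
result = max_subarray_brute_force(example)  # (6, 3, 6): the subarray 4, -1, 2, 1
```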

The problem was first posed by Ulf Grenander of Brown University in 1977, as a simplified model for maximum likelihood estimation of patterns in digitized images. A linear-time algorithm was found soon afterwards by Jay Kadane of Carnegie-Mellon University (Bentley 1984).

Kadane's algorithm

Kadane's algorithm scans through the array values, computing at each position the maximum sum of a subarray ending at that position. This subarray is either empty (in which case its sum is zero) or extends the maximum subarray ending at the previous position by one element. Thus, the problem can be solved with the following code, expressed here in Python:

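def max_subarray(A):
    # Empty subarrays are allowed, so both running values start at 0.
    max_ending_here = max_so_far = 0
    for x in A:
        # Either extend the best subarray ending at the previous position,
        # or fall back to the empty subarray (sum 0).
        max_ending_here = max(0, max_ending_here + x)
        max_so_far = max(max_so_far, max_ending_here)
    return max_so_far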

A variation of the problem disallows zero-length subarrays, so that an array consisting entirely of negative numbers yields its largest (least negative) element. It can be solved with the following code, which assumes a non-empty array:

def max_subarray(A):
    max_ending_here = max_so_far = A[0]
    for x in A[1:]:
        max_ending_here = max(x, max_ending_here + x)
        max_so_far = max(max_so_far, max_ending_here)
    return max_so_far

Here is a version in C++; the commented-out lines show where optional variables can be added to track the beginning and ending indices of the maximum subarray:

#include <cstddef>
#include <vector>

// Returns the largest sum of a non-empty contiguous subarray.
// Precondition: numbers is not empty.
int sequence(std::vector<int> const & numbers)
{
    int max_so_far = numbers[0], max_ending_here = numbers[0];

    // OPTIONAL: These variables can be added in to track the position of the subarray
    // std::size_t begin = 0;
    // std::size_t begin_temp = 0;
    // std::size_t end = 0;

    for (std::size_t i = 1; i < numbers.size(); i++)
    {
        if (max_ending_here < 0)
        {
            // A negative running sum cannot help; start a new subarray at i.
            max_ending_here = numbers[i];

            // begin_temp = i;
        }
        else
        {
            max_ending_here += numbers[i];
        }

        if (max_ending_here >= max_so_far)
        {
            max_so_far = max_ending_here;

            // begin = begin_temp;
            // end = i;
        }
    }
    return max_so_far;
}

The algorithm can also be easily modified to keep track of the starting and ending indices of the maximum subarray (see commented code).
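
For comparison, here is a sketch of the same index tracking in Python, with the bookkeeping enabled rather than commented out. The function name `max_subarray_with_indices` and the returned tuple are assumptions made for illustration; the variable names `begin`, `begin_temp`, and `end` follow the C++ comments. Indices are inclusive and the input is assumed to be non-empty.

```python
def max_subarray_with_indices(A):
    """Return (max_sum, begin, end) with inclusive indices; A must be non-empty."""
    max_so_far = max_ending_here = A[0]
    begin = begin_temp = end = 0
    for i in range(1, len(A)):
        if max_ending_here < 0:
            # A negative running sum cannot help: start a new candidate at i.
            max_ending_here = A[i]
            begin_temp = i
        else:
            max_ending_here += A[i]
        if max_ending_here >= max_so_far:
            # Record the candidate; on ties the later subarray wins.
            max_so_far = max_ending_here
            begin = begin_temp
            end = i
    return max_so_far, begin, end


best = max_subarray_with_indices([-2, 1, -3, 4, -1, 2, 1, -5, 4])  # (6, 3, 6)
```

Because the comparison uses `>=`, ties are resolved in favour of the subarray found later in the scan; switching to `>` would keep the earliest one instead.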

Because the maximum subarray ending at each position is computed in a simple way from a related, smaller, overlapping subproblem (the maximum subarray ending at the previous position), the algorithm exhibits optimal substructure and can be viewed as a simple example of dynamic programming.
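
To make the dynamic programming view concrete, the sketch below stores the answer to every subproblem in a table instead of keeping only the most recent one. The function name `max_ending_at_each_position` and the table name `B` are hypothetical; `B[i]` holds the largest sum of a non-empty subarray ending at position `i`, following the non-empty variant of the algorithm.

```python
def max_ending_at_each_position(A):
    """Return B where B[i] is the largest sum of a non-empty subarray ending at i."""
    B = []
    for i, x in enumerate(A):
        if i == 0:
            B.append(x)  # base case: the only subarray ending at 0 is [A[0]]
        else:
            # Either start fresh at x, or extend the best subarray ending at i - 1.
            B.append(max(x, B[i - 1] + x))
    return B


table = max_ending_at_each_position([-2, 1, -3, 4, -1, 2, 1, -5, 4])
# table == [-2, 1, -2, 4, 3, 5, 6, 1, 5]; the overall answer is max(table) == 6
```

Kadane's algorithm needs only the previous table entry at each step, which is why it can replace the table with a single running variable.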

The runtime complexity of Kadane's algorithm is O(n).

Divide and conquer

A divide and conquer algorithm is described by Bentley (1984); its time complexity is O(n log n).
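
The following is a minimal sketch of one common textbook formulation of the divide and conquer approach; it is not claimed to match Bentley's presentation line for line. The array is split in half, each half is solved recursively, and the best subarray crossing the midpoint is found with two linear scans. The function name `max_subarray_dc` and the half-open range parameters `lo` and `hi` are assumptions made for illustration, and only non-empty subarrays are considered.

```python
def max_subarray_dc(A, lo=0, hi=None):
    """Largest sum of a non-empty contiguous subarray of A[lo:hi]."""
    if hi is None:
        hi = len(A)
    if hi - lo < 1:
        raise ValueError("range must be non-empty")
    if hi - lo == 1:
        return A[lo]
    mid = (lo + hi) // 2

    # Best sum of a subarray that ends exactly at mid - 1 (suffix of the left half).
    best_left = running = A[mid - 1]
    for i in range(mid - 2, lo - 1, -1):
        running += A[i]
        best_left = max(best_left, running)

    # Best sum of a subarray that starts exactly at mid (prefix of the right half).
    best_right = running = A[mid]
    for i in range(mid + 1, hi):
        running += A[i]
        best_right = max(best_right, running)

    crossing = best_left + best_right
    return max(max_subarray_dc(A, lo, mid), max_subarray_dc(A, mid, hi), crossing)


answer = max_subarray_dc([-2, 1, -3, 4, -1, 2, 1, -5, 4])  # 6
```

Each level of recursion does linear work for the crossing case and there are about log n levels, which gives the O(n log n) bound.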

Generalizations

Similar problems may be posed for higher-dimensional arrays, but their solutions are more complicated; see, e.g., Takaoka (2002). Brodal & Jørgensen (2007) showed how to find the k largest subarray sums in a one-dimensional array, in the optimal time bound O(n + k).
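
To illustrate what the two-dimensional version of the problem asks for, here is a sketch of a straightforward baseline: fix a pair of top and bottom rows, collapse the rows between them into column sums, and run the one-dimensional scan on those sums. This is not the algorithm of Takaoka (2002), which is faster; the function names `max_subarray_1d` and `max_submatrix_sum` are hypothetical, and the input is assumed to be a non-empty rectangular list of lists.

```python
def max_subarray_1d(values):
    """Kadane's scan for non-empty subarrays; values must be non-empty."""
    best = current = values[0]
    for x in values[1:]:
        current = max(x, current + x)
        best = max(best, current)
    return best


def max_submatrix_sum(matrix):
    """Largest sum of a non-empty contiguous rectangular block of matrix."""
    rows, cols = len(matrix), len(matrix[0])
    best = matrix[0][0]
    for top in range(rows):
        column_sums = [0] * cols  # sums of rows top..bottom, per column
        for bottom in range(top, rows):
            for c in range(cols):
                column_sums[c] += matrix[bottom][c]
            # Best block spanning exactly rows top..bottom.
            best = max(best, max_subarray_1d(column_sums))
    return best


block_sum = max_submatrix_sum([[1, -2], [-3, 4]])  # 4
```

For a matrix with r rows and c columns this baseline performs on the order of r² · c operations.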
