Jump to content

User:Miguel.v/sandbox

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Miguel.v (talk | contribs) at 15:27, 6 April 2014. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

The algorithm outlined below solves the longest increasing subsequence problem efficiently with arrays and binary searching. It processes the sequence elements in order, maintaining the longest increasing subsequence found so far. Denote the sequence values as X[0], X[1], etc. Then, after processing X[i], the algorithm will have stored values in two arrays:

M[j] — stores the index k of the smallest value X[k] such that there is an increasing subsequence of length j ending at X[k] on the range ki (note we have jki here, because j represents the length of the increasing subsequence, and k represents the index of its termination. Obviously, we can never have an increasing subsequence of length 13 ending at index 11. ki by definition).
P[k] — stores the index of the predecessor of X[k] in the longest increasing subsequence ending at X[k].

In addition the algorithm stores a variable L representing the length of the longest increasing subsequence found so far. Because the algorithm below uses zero-based numbering, for clarity we pad M with M[0], which goes unused so that M[j] corresponds to a subsequence of length j. A real implementation can skip M[0] and adjust the indices accordingly.

Note that, at any point in the algorithm, the sequence

X[M[1]], X[M[2]], ..., X[M[L]]

is nondecreasing. For, if there is an increasing subsequence of length i ending at X[M[i]], then there is also a subsequence of length i-1 ending at a smaller value: namely the one ending at X[P[M[i]]]. Thus, we may do binary searches in this sequence in logarithmic time.

The algorithm, then, proceeds as follows:

 X: input array of length N
 P: array of length N, initialized with invalid values
 M: array of length N + 1, initialized with invalid values

 L = 0
 for i in range 0 to N-1:
   // Binary search for the largest positive j ≤ L
   // such that X[M[j]] < X[i]
   lo = 1
   hi = L
   while lo ≤ hi:
     mid = (lo+hi)/2
     if X[M[mid]] < X[i]:
       lo = mid+1
     else:
       hi = mid-1

   // After searching, lo is 1 greater than the
   // length of the longest subsequence found so far 
   newL = lo

   // The predecessor of i is the last index of 
   // the subsequence of length newL-1
   P[i] = M[newL-1]

   if newL == L+1:
     // If we found a subsequence of new length,
     // update M and L
     M[newL] = i
     L = L+1
   else if X[i] < X[M[newL]]:
     // If we found a smaller last value for the
     // subsequence of length newL, only update M
     M[newL] = i

 // Reconstruct the longest increasing subsequence
 S: array of length L
 k: M[L]
 for i in range L-1 to 0:
   S[i] = X[k]
   k = P[k]

 return S

Because the algorithm performs a single binary search per sequence element, its total time can be expressed using Big O notation as O(n log n). Fredman (1975) discusses a variant of this algorithm, which he credits to Donald Knuth; in the variant that he studies, the algorithm tests whether each value X[i] can be used to extend the current longest increasing sequence, in constant time, prior to doing the binary search. With this modification, the algorithm uses at most n log2 nn log2log2 n + O(n) comparisons in the worst case, which is optimal for a comparison-based algorithm up to the constant factor in the O(n) term.[1]

  1. ^ Fredman, Michael L. (1975), "On computing the length of longest increasing subsequences", Discrete Mathematics, 11 (1): 29–35, doi:10.1016/0012-365X(75)90103-X.