Shortest common supersequence

From Wikipedia, the free encyclopedia
Jump to: navigation, search

In computer science, the shortest common supersequence problem is a problem closely related to the longest common subsequence problem. Given two sequences X = < x1,...,xm > and Y = < y1,...,yn >, a sequence U = < u1,...,uk > is a common supersequence of X and Y if U is a supersequence of both X and Y. In other words, a shortest common supersequence of strings x and y is a shortest string z such that both x and y are subsequences of z.

A shortest common supersequence (scs) is a common supersequence of minimal length. In the shortest common supersequence problem, the two sequences X and Y are given and the task is to find a shortest possible common supersequence of these sequences. In general, an scs is not unique.

For two input sequences, an scs can be formed from a longest common subsequence (lcs) easily. For example, if X[1..m] = abcbdab and Y[1..n] = bdcaba, the lcs is Z[1..r] = bcba. By inserting the non-lcs symbols while preserving the symbol order, we get the scs: U[1..t] = abdcabdab.

It is quite clear that  r + t = m + n for two input sequences. However, for three or more input sequences this does not hold. Note also, that the lcs and the scs problems are not dual problems.


External links[edit]