# Subsequence

In mathematics, a subsequence is a sequence that can be derived from another sequence by deleting some elements without changing the order of the remaining elements. For example, the sequence ${\displaystyle \langle A,B,D\rangle }$ is a subsequence of ${\displaystyle \langle A,B,C,D,E,F\rangle }$ obtained after removal of elements ${\displaystyle C}$, ${\displaystyle E}$, and ${\displaystyle F}$. The relation of one sequence being the subsequence of another is a preorder.

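A subsequence should not be confused with a substring. A substring such as ${\displaystyle \langle A,B,C,D\rangle }$, which can be derived from the string ${\displaystyle \langle A,B,C,D,E,F\rangle }$ by deleting the substring ${\displaystyle \langle E,F\rangle }$, consists of consecutive elements of the original. Every substring is a subsequence, but not every subsequence is a substring; in this sense the substring is a refinement of the subsequence.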
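
The difference between the two notions can be sketched in Python. This is a minimal illustration only; the function names `is_subsequence` and `is_substring` are chosen here and do not come from the article.

```python
from typing import Sequence


def is_subsequence(candidate: Sequence, source: Sequence) -> bool:
    """Return True if `candidate` can be obtained from `source` by deleting elements."""
    it = iter(source)
    # `x in it` consumes the iterator up to the first match, so each element
    # of `candidate` must be found after the previous one: order is preserved.
    return all(x in it for x in candidate)


def is_substring(candidate: Sequence, source: Sequence) -> bool:
    """Return True if `candidate` occurs as a contiguous run inside `source`."""
    n, m = len(source), len(candidate)
    return any(list(source[i:i + m]) == list(candidate) for i in range(n - m + 1))


print(is_subsequence("ABD", "ABCDEF"))  # True
print(is_substring("ABD", "ABCDEF"))    # False
print(is_substring("ABCD", "ABCDEF"))   # True
```

Every input accepted by `is_substring` is also accepted by `is_subsequence`, but not the other way around.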

## Common subsequence

Given two sequences X and Y, a sequence Z is said to be a common subsequence of X and Y if Z is a subsequence of both X and Y. For example, if

${\displaystyle X=\langle A,C,B,D,E,G,C,E,D,B,G\rangle }$ and
${\displaystyle Y=\langle B,E,G,C,F,E,U,B,K\rangle }$

then a common subsequence of X and Y could be

${\displaystyle Z=\langle B,E,E\rangle .}$

This would not be the longest common subsequence, since Z only has length 3, and the common subsequence ${\displaystyle \langle B,E,E,B\rangle }$ has length 4. The longest common subsequence of X and Y is ${\displaystyle \langle B,E,G,C,E,B\rangle }$.
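
One common way to compute a longest common subsequence is dynamic programming over prefixes of the two sequences. The sketch below assumes that approach; the function name `longest_common_subsequence` is illustrative and not taken from the article.

```python
from typing import List, Sequence


def longest_common_subsequence(x: Sequence, y: Sequence) -> List:
    """Return one longest common subsequence of x and y as a list."""
    n, m = len(x), len(y)
    # table[i][j] = length of a longest common subsequence of x[:i] and y[:j]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if x[i - 1] == y[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    # Walk back from the bottom-right corner to recover one solution.
    result = []
    i, j = n, m
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            result.append(x[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    result.reverse()
    return result


X = "ACBDEGCEDBG"
Y = "BEGCFEUBK"
print("".join(longest_common_subsequence(X, Y)))  # BEGCEB
```

The table has (|X| + 1)(|Y| + 1) entries, so time and memory both grow with the product of the two lengths.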

## Applications

Subsequences have applications to computer science,[1] especially in the discipline of bioinformatics, where computers are used to compare, analyze, and store DNA, RNA, and protein sequences.

Take two sequences of DNA, each containing 37 elements, say:

SEQ1 = ACGGTGTCGTGCTATGCTGATGCTGACTTATATGCTA
SEQ2 = CGTTCGGCTATCGTACGTTCTATTCTATGATTTCTAA

The longest common subsequence of sequences 1 and 2 is:

LCS(SEQ1,SEQ2) = CGTTCGGCTATGCTTCTACTTATTCTA

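This can be illustrated by marking the 27 elements of the longest common subsequence in the initial sequences. Here, elements belonging to the longest common subsequence are written in uppercase and the remaining elements in lowercase:

SEQ1 = aCgGTgTCGtGCTATGCTgaTgCTgACTTATaTgCTA
SEQ2 = CGTTCGGCTATcGtaCgTTCTAttCTaTgATTtCTAa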

Another way to show this is to align the two sequences, that is, to position elements of the longest common subsequence in the same column (indicated by the vertical bar) and to introduce a special character (here, a dash) into one sequence wherever the two elements in a column would otherwise differ:

SEQ1 = ACGGTGTCGTGCTAT-G--C-TGATGCTGA--CT-T-ATATG-CTA-
        | || ||| ||||| |  | |  | || |  || | || |  |||
SEQ2 = -C-GT-TCG-GCTATCGTACGT--T-CT-ATTCTATGAT-T-TCTAA
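
An alignment of this kind can be produced by tracing back through the dynamic-programming table for the longest common subsequence. The following is a minimal sketch under that assumption, not the method used to produce the alignment above; the function name `lcs_alignment` and its `gap` parameter are hypothetical. Several alignments can realize the same longest common subsequence, so the gap placement it prints may differ from the one shown.

```python
from typing import Tuple


def lcs_alignment(a: str, b: str, gap: str = "-") -> Tuple[str, str, str]:
    """Return (aligned_a, marker_line, aligned_b) for one LCS-based alignment."""
    n, m = len(a), len(b)
    # table[i][j] = length of a longest common subsequence of a[:i] and b[:j]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    top, mid, bottom = [], [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
            # Matching elements share a column, marked with a vertical bar.
            top.append(a[i - 1])
            mid.append("|")
            bottom.append(b[j - 1])
            i -= 1
            j -= 1
        elif j == 0 or (i > 0 and table[i - 1][j] >= table[i][j - 1]):
            # Element of `a` with no partner: put a gap in `b`.
            top.append(a[i - 1])
            mid.append(" ")
            bottom.append(gap)
            i -= 1
        else:
            # Element of `b` with no partner: put a gap in `a`.
            top.append(gap)
            mid.append(" ")
            bottom.append(b[j - 1])
            j -= 1
    return "".join(reversed(top)), "".join(reversed(mid)), "".join(reversed(bottom))


SEQ1 = "ACGGTGTCGTGCTATGCTGATGCTGACTTATATGCTA"
SEQ2 = "CGTTCGGCTATCGTACGTTCTATTCTATGATTTCTAA"
top, mid, bottom = lcs_alignment(SEQ1, SEQ2)
print("SEQ1 = " + top)
print("       " + mid)
print("SEQ2 = " + bottom)
```

Removing the dashes from either output line gives back the original sequence, and the number of vertical bars equals the length of the longest common subsequence.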

Subsequences are used to determine how similar two strands of DNA are, where each strand is represented as a sequence of the DNA bases adenine (A), guanine (G), cytosine (C) and thymine (T).
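
As a rough illustration of such a comparison, the sketch below computes the length of a longest common subsequence of the two example strands and divides it by the length of the longer strand. This normalized score is an assumption made here for illustration, not a measure defined in the article, and the function name `lcs_length` is likewise hypothetical.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of a longest common subsequence, keeping only two table rows."""
    previous = [0] * (len(b) + 1)
    for ch_a in a:
        current = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


SEQ1 = "ACGGTGTCGTGCTATGCTGATGCTGACTTATATGCTA"
SEQ2 = "CGTTCGGCTATCGTACGTTCTATTCTATGATTTCTAA"

length = lcs_length(SEQ1, SEQ2)
# Hypothetical similarity score: LCS length relative to the longer sequence.
similarity = length / max(len(SEQ1), len(SEQ2))
print(length, round(similarity, 3))
```

Keeping only two rows is enough when just the length is needed; recovering the subsequence itself requires the full table or a more elaborate scheme.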