Peptide sequence

From Wikipedia, the free encyclopedia

Peptide sequence, or amino acid sequence, is the order in which amino acid residues, connected by peptide bonds, lie in the chain in peptides and proteins. The sequence is generally reported from the N-terminal end, which contains a free amino group, to the C-terminal end, which contains a free carboxyl group. A peptide sequence is often called a protein sequence if it represents the primary structure of a protein.

Sequence notation and applications

Many peptide sequences have been added to sequence databases. These databases may use various notations to describe the peptide sequence. The full names of the amino acids are rarely given; instead, 3-letter or 1-letter abbreviations are usually recorded for conciseness.
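As an illustration of the two abbreviated notations, the following minimal sketch converts between 3-letter and 1-letter codes for the 20 common amino acids. The function names and the hyphen separator are assumptions made for this example; individual databases may use other delimiters or none at all.

```python
# Standard 3-letter to 1-letter codes for the 20 common amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}


def to_one_letter(three_letter_seq, sep="-"):
    """Convert e.g. 'Gly-Ala-Ser' to 'GAS', keeping N-to-C order.

    The separator is an assumption of this sketch, not a fixed standard.
    """
    return "".join(
        THREE_TO_ONE[res.capitalize()] for res in three_letter_seq.split(sep)
    )


def to_three_letter(one_letter_seq, sep="-"):
    """Convert e.g. 'GAS' to 'Gly-Ala-Ser', keeping N-to-C order."""
    return sep.join(ONE_TO_THREE[res] for res in one_letter_seq.upper())
```

For example, `to_one_letter("Gly-Ala-Ser")` gives `"GAS"`, and `to_three_letter("GAS")` gives back `"Gly-Ala-Ser"`. Unknown codes raise a `KeyError` in this sketch rather than being silently skipped.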

Several deductions can be made from the sequence itself. Long stretches of hydrophobic residues may indicate transmembrane helices, which in turn may suggest that the peptide is a cell receptor. Certain residues are characteristic of beta-sheet regions. If the full-length protein sequence is available, the isoelectric point of the protein can be estimated. Methods for determining a peptide sequence include deduction from the DNA sequence, Edman degradation, and mass spectrometry.
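The search for long hydrophobic stretches can be sketched with a sliding-window hydropathy average. The example below is only an illustration under stated assumptions: it uses the Kyte-Doolittle hydropathy scale, a window of 19 residues, and a threshold of 1.6, which are commonly quoted choices for this kind of scan but are not prescribed by this article. The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (higher means more hydrophobic).
# The choice of this particular scale is an assumption of the sketch.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full-length window along the sequence."""
    scores = [KD[res] for res in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_tm_windows(seq, window=19, threshold=1.6):
    """0-based start positions of windows whose mean exceeds the threshold.

    Window size and threshold are assumed defaults, not fixed rules.
    """
    return [
        start
        for start, mean in enumerate(hydropathy_profile(seq, window))
        if mean > threshold
    ]
```

A window that passes the threshold is only a candidate: the scan suggests where a transmembrane helix might lie, and it returns an empty list for sequences shorter than the window.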

Techniques in sequence analysis can be applied to learn more about the peptide. These techniques generally consist of comparing the sequence with other sequences in sequence databases. Some of those sequences may already have been studied and found to be significant, and findings about them may be applicable to the sequence under investigation.
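The idea of comparing a sequence against a database can be illustrated with a toy global alignment score (the Needleman-Wunsch recurrence with a linear gap penalty). This is a minimal sketch: the match, mismatch, and gap values are arbitrary assumptions, the database is a hypothetical dictionary of names to sequences, and practical tools typically rely on substitution matrices and faster heuristic searches instead.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    The scoring values are illustrative assumptions only.
    """
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def rank_database(query, database):
    """Return (name, score) pairs, most similar first.

    `database` is a hypothetical dict mapping entry names to sequences.
    """
    scored = [
        (name, global_alignment_score(query, seq))
        for name, seq in database.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Entries near the top of the ranking are the ones whose known properties are most likely worth checking against the query sequence.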

Amino acid notation

Single letters can be used to indicate sets of similar residues, much as abbreviation codes are used for degenerate bases.[1][2]

Symbol   Description                  Residues represented
x        Any amino acid, or unknown   All
B        Aspartate derivatives        D, N
Z        Glutamate derivatives        E, Q
Φ        Hydrophobic                  V, I, L, F, W, Y, M
Ω        Aromatic                     F, W, Y
Ψ        Aliphatic                    V, I, L, M
π        Small                        P, G, A, S
ζ        Hydrophilic                  S, T, H, N, Q, E, D, K, R
+        Positively charged           K, R
-        Negatively charged           D, E
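One way to use these class symbols is to expand a motif written with them into a regular expression. The sketch below is illustrative only: the function names are hypothetical, the residue sets are copied from the table above, and "All" is assumed here to mean the 20 common amino acids. Any character that is not a class symbol is treated as a literal residue.

```python
import re

# Residue classes taken from the table above; "x" is assumed to cover
# the 20 common amino acids.
CLASSES = {
    "x": "ACDEFGHIKLMNPQRSTVWY",
    "B": "DN",
    "Z": "EQ",
    "Φ": "VILFWYM",
    "Ω": "FWY",
    "Ψ": "VILM",
    "π": "PGAS",
    "ζ": "STHNQEDKR",
    "+": "KR",
    "-": "DE",
}


def motif_to_regex(motif):
    """Translate a motif such as 'PxxP' into a compiled regular expression."""
    parts = []
    for symbol in motif:
        if symbol in CLASSES:
            parts.append("[" + CLASSES[symbol] + "]")
        else:
            # Literal one-letter residue code.
            parts.append(re.escape(symbol))
    return re.compile("".join(parts))


def find_motif(motif, seq):
    """0-based start positions of all matches, including overlapping ones."""
    pattern = motif_to_regex(motif).pattern
    # A lookahead keeps each match zero-width so overlaps are not skipped.
    return [m.start() for m in re.finditer("(?=" + pattern + ")", seq)]
```

For example, `find_motif("PxxP", "APPLPAA")` reports a single match starting at position 1, and `find_motif("Φ+", "AKLRVK")` reports matches at positions 2 and 4.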

References


  1. Aasland, Rein; Abrams, Charles; Ampe, Christophe; Ball, Linda J.; Bedford, Mark T.; Cesareni, Gianni; Gimona, Mario; Hurley, James H.; Jarchau, Thomas (2002-02-20). "Normalization of nomenclature for peptide motifs as ligands of modular protein domains". FEBS Letters. 513 (1): 141–144. doi:10.1016/S0014-5793(01)03295-1. ISSN 1873-3468.
  2. "A One-Letter Notation for Amino Acid Sequences". European Journal of Biochemistry. 5 (2): 151–153. 1968-07-01. doi:10.1111/j.1432-1033.1968.tb00350.x. ISSN 1432-1033.