Open reading frame

From Wikipedia, the free encyclopedia
Sample sequence showing three different possible reading frames. Start codons are highlighted in purple, and stop codons are highlighted in red.

In molecular genetics, an open reading frame (ORF) is the part of a reading frame that has the potential to code for a protein or peptide. An ORF is a continuous stretch of codons that contains no stop codon (usually UAA, UAG or UGA).[1] An AUG codon within the ORF (not necessarily the first) may indicate where translation starts. The transcription termination site is located after the ORF, beyond the translation stop codon, because if transcription were to cease before the stop codon, an incomplete protein would be made during translation.[2] In eukaryotic genes with multiple exons, an ORF may span several exons, which are spliced together to form a continuous ORF in the mRNA.

Biological significance

One common use of open reading frames is as one piece of evidence to assist in gene prediction. Long ORFs are often used, along with other evidence, to initially identify candidate protein-coding regions in a DNA sequence.[3] The presence of an ORF does not necessarily mean that the region is ever translated. For example, in a randomly generated DNA sequence with an equal percentage of each nucleotide, a stop codon would be expected about once every 21 codons, because 3 of the 64 possible codons are stop codons.[3] A simple gene prediction algorithm for prokaryotes might look for a start codon followed by an open reading frame that is long enough to encode a typical protein, where the codon usage of that region matches the frequency characteristic of the given organism's coding regions.[3] By itself, even a long open reading frame is not conclusive evidence for the presence of a gene.[3] On the other hand, it has been shown that some short ORFs (sORFs) that lack the classical hallmarks of protein-coding genes (both from ncRNAs and mRNAs) can produce functional peptides.[4]
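
The simple search described above (a start codon followed by a sufficiently long stop-free stretch) can be sketched in Python. This is only an illustration under stated assumptions: the function name find_orfs and the min_codons parameter are hypothetical, the input is taken to be DNA (ATG, TAA, TAG, TGA) on a single strand, and the codon-usage check mentioned in the text is left out.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # DNA equivalents of UAA, UAG, UGA
START_CODON = "ATG"                  # DNA equivalent of AUG


def find_orfs(seq, min_codons=100):
    """Return (start, end, frame) tuples for ATG...stop ORFs on one strand.

    start and end are 0-based, end is exclusive and includes the stop codon.
    min_codons is a hypothetical length threshold (stop codon not counted).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG since the last stop codon
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


# Toy sequence with one short ORF (ATG AAA TTT GGG TAA) in the third frame
print(find_orfs("CCATGAAATTTGGGTAACC", min_codons=3))
```

A real gene finder would also scan the reverse complement and weigh further evidence, since a long ORF alone is not conclusive.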

Six-frame translation

Since DNA is interpreted in groups of three nucleotides (codons), a DNA strand has three distinct reading frames. The double helix of a DNA molecule has two anti-parallel strands so, with the two strands having three reading frames each, there are six possible frame translations.
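
As a rough illustration, the six translations can be produced by translating the sequence and its reverse complement at offsets 0, 1 and 2. The sketch below assumes the standard genetic code and an input made up only of A, C, G and T; the function names and the frame labels (+1 to +3, -1 to -3) are choices made for this example, not a fixed convention of any particular tool.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; '*' marks a stop, 'X' an unknown codon."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Map frame labels to the translation of each of the six frames."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


for label, peptide in sorted(six_frame_translation("ATGGCCTAA").items()):
    print(label, peptide)
```

Trailing bases that do not fill a whole codon are ignored in each frame.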

Example of a six-frame translation

ORF finding tools

  1. ORF Finder: The ORF Finder (Open Reading Frame Finder) is a graphical analysis tool that finds all open reading frames of a selectable minimum size in a user's sequence or in a sequence already in the database. The tool identifies open reading frames using the standard or alternative genetic codes. The deduced amino acid sequence can be saved in various formats and searched against the sequence database using the BLAST server. ORF Finder is intended to help in preparing complete and accurate sequence submissions, and it is also packaged with the Sequin sequence submission software.
  2. ORF Investigator: ORF Investigator is a program that reports the coding and non-coding sequences and can also perform pairwise global alignment of different gene or DNA region sequences. The tool finds the ORFs, translates them into single-letter amino acid code, and reports their locations in the sequence. The pairwise global alignment makes it convenient to detect mutations, including single nucleotide polymorphisms. The Needleman-Wunsch algorithm is used for the alignment. ORF Investigator is written in Perl, a portable programming language, and is therefore available to users of all common operating systems.
  3. OrfPredictor: OrfPredictor is a web server designed for identifying protein-coding regions in expressed sequence tag (EST)-derived sequences. For query sequences with a hit in BLASTX, the program predicts the coding regions based on the translation reading frames identified in the BLASTX alignments; otherwise, it predicts the most probable coding region based on the intrinsic signals of the query sequences. The output is the predicted peptide sequences in FASTA format, with a definition line that includes the query ID, the translation reading frame and the nucleotide positions where the coding region begins and ends. OrfPredictor facilitates the annotation of EST-derived sequences, particularly for large-scale EST projects.
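
To make the pairwise global alignment mentioned for ORF Investigator more concrete, here is a minimal Python sketch of Needleman-Wunsch alignment. It is not that tool's implementation (which is written in Perl). The scoring values (match +1, mismatch -1, linear gap penalty -1) and the function name are assumptions chosen for illustration.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align a and b; return (score, aligned_a, aligned_b).

    The scoring scheme is an illustrative assumption, not a recommended one.
    """
    def sub(x, y):
        return match if x == y else mismatch

    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(a[i - 1], b[j - 1]),  # (mis)match
                score[i - 1][j] + gap,                          # gap in b
                score[i][j - 1] + gap,                          # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(a[i - 1], b[j - 1]):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


# A single-base difference shows up as one mismatched column
print(needleman_wunsch("GATTACA", "GACTACA"))
```

In the aligned output, a substitution such as a single nucleotide polymorphism appears as a column with two different bases, while an insertion or deletion appears as a "-" in one of the sequences.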

References

  1. "Open reading frame". U.S. National Library of Medicine. 2015-10-19. Retrieved 2015-10-22.
  2. Slonczewski, Joan; John Watkins Foster (2009). Microbiology: An Evolving Science. New York: W.W. Norton & Co. ISBN 978-0-393-97857-5. OCLC 185042615.
  3. Deonier, Richard; Simon Tavaré; Michael Waterman (2005). Computational Genome Analysis: An Introduction. Springer-Verlag. p. 25. ISBN 0-387-98785-1.
  4. Zanet, J.; Benrabah, E.; Li, T.; Pelissier-Monier, A.; Chanut-Delalande, H.; Ronsin, B.; Bellen, H. J.; Payre, F.; Plaza, S. (2015). "Pri sORF peptides induce selective proteasome-mediated protein processing". Science 349 (6254): 1356–1358. doi:10.1126/science.aac5677. ISSN 0036-8075.

External links

  • ORF Investigator - An ORF finding and gene alignment program developed by Vivek Dhar Dwivedi and Sarad Kumar Mishra.
  • A web page where you can buy the book 'Gene Cloning and DNA Analysis' by T. A. Brown (2010)
  • Translation and Open Reading Frames
  • NCBI ORF finder - A web-based interactive tool for predicting and analysing ORFs from nucleotide sequences.
  • ORF finder - A web-based interactive tool for predicting and analysing ORFs from nucleotide sequences - hosted at
  • hORFeome V5.1 - A web-based interactive tool for the CCSB Human ORFeome Collection
  • ORF Marker - A free, fast and multi-platform desktop GUI tool for predicting and analyzing ORFs
  • StarORF - A multi-platform, Java-based GUI tool for predicting and analyzing ORFs and obtaining the reverse complement sequence
  • ORFPredictor - A web server designed for ORF prediction and translation of a batch of EST or cDNA sequences