Open reading frame: Difference between revisions

Revision as of 16:46, 3 July 2008

An open reading frame (ORF) is a portion of an organism's genome that contains a sequence of bases that could potentially encode a protein. The start and end points of an ORF do not coincide with the ends of the mRNA, but they usually lie within the mRNA sequence. In a gene, the ORF lies between the start codon (initiation codon) and the stop codon (termination codon). ORFs are usually encountered when sifting through pieces of DNA while trying to locate a gene. Because organisms with altered genetic codes use different start codons, the same sequence can yield different ORFs depending on which genetic code is assumed. A typical ORF finder therefore employs algorithms based on the known genetic codes (including the altered ones) and examines all possible reading frames.

The existence of an ORF, especially a long one, is usually a good indication of the presence of a gene in the surrounding sequence. In this case, the ORF is part of the sequence that will be translated by the ribosomes; it will be long, and if the DNA is eukaryotic, the ORF may continue across gaps called introns. However, short ORFs can also occur by chance outside genes. Such ORFs are usually not very long and terminate after a few codons.
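
Why chance ORFs tend to be short can be illustrated with a rough calculation. The sketch below is not taken from the article: it assumes that bases are independent and equally frequent and that the standard genetic code applies (3 stop codons out of 64), and the function names are illustrative only.

```python
# Assumption: bases are independent and equally frequent, so each of the
# 64 codons is equally likely; the standard code has 3 stop codons.
STOP_FRACTION = 3 / 64


def prob_no_stop(n_codons, stop_fraction=STOP_FRACTION):
    """Probability that n_codons consecutive random codons contain no stop codon."""
    return (1 - stop_fraction) ** n_codons


def mean_codons_until_stop(stop_fraction=STOP_FRACTION):
    """Mean number of codons read up to and including the first stop codon
    (mean of a geometric distribution)."""
    return 1 / stop_fraction


# Chance of a stop-free stretch of various lengths in random sequence.
for n in (10, 50, 100, 300):
    print(n, round(prob_no_stop(n), 4))

print(round(mean_codons_until_stop(), 1))
```

Under these assumptions a stop codon turns up on average about once every 21 codons, so a stop-free stretch of several hundred codons is very unlikely to arise by chance. Real genomes have biased base composition, so the numbers are only indicative.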

Once a gene has been sequenced, it is important to determine the correct open reading frame (ORF). Theoretically, double-stranded DNA can be read in six reading frames: three in the forward direction and three in the reverse direction. In prokaryotes, the longest sequence without a stop codon usually determines the open reading frame. Eukaryotic mRNA is typically monocistronic and therefore contains only a single ORF. A problem arises when working with eukaryotic pre-mRNA: long parts of the sequence within an ORF (introns) are not translated. To find eukaryotic open reading frames, it is therefore necessary to look at the spliced messenger RNA (mRNA).
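
The six-frame search described above can be sketched as follows. This is a simplified illustration, not the method of any particular ORF finder: it assumes the standard stop codons, works on a DNA string containing only A, C, G and T, and, following the text, looks only for the longest stretch without a stop codon (it does not require a start codon). All names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}          # standard genetic code, DNA alphabet
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse strand, read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def longest_open_stretch(dna):
    """Scan all six reading frames and return the longest run of codons
    that contains no stop codon, with its strand and frame (1-3)."""
    best = {"strand": "+", "frame": 1, "codons": []}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            run = []
            for i in range(offset, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if codon in STOP_CODONS:
                    run = []                 # a stop codon closes the current stretch
                    continue
                run.append(codon)
                if len(run) > len(best["codons"]):
                    best = {"strand": strand, "frame": offset + 1,
                            "codons": list(run)}
    return best


result = longest_open_stretch("TCTAAAGGTGAC")
print(result["strand"], result["frame"], " ".join(result["codons"]))
```

Ties are resolved in favour of the frame examined first (forward strand, frame 1). A practical tool would also handle ambiguous bases, alternative genetic codes and start codons.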

For example, in the sequence 5'-UCUAAAGGUGAC-3', two of the three possible reading frames are open. This is one of the two possible mRNA sequences of the transcript, and it can be read in three different ways:

UCU AAA GGU GAC
CUA AAG GUG etc
UAA AGG UGA etc

The third possibility contains a stop codon (UAA), so only two of the three reading frames are open (that is, they contain no stop codons).
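
The same check can be reproduced in a few lines of code. This is a minimal sketch that assumes the standard stop codons (UAA, UAG, UGA); the helper names are illustrative and not part of any library.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}   # standard genetic code, RNA alphabet


def codons_in_frame(seq, offset):
    """Split seq into complete codons, starting at offset 0, 1 or 2."""
    return [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]


def frame_is_open(seq, offset):
    """A frame is open if none of its codons is a stop codon."""
    return not any(codon in STOP_CODONS for codon in codons_in_frame(seq, offset))


seq = "UCUAAAGGUGAC"
for offset in range(3):
    status = "open" if frame_is_open(seq, offset) else "closed"
    print(offset + 1, " ".join(codons_in_frame(seq, offset)), status)
```

Incomplete codons at the end of a frame are dropped, which corresponds to the "etc" in the listing above.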

External links

Translation and Open Reading Frames
NCBI ORF finder - A web based interactive tool for predicting and analysing ORFs from nucleotide sequences.
ORF finder - A web based interactive tool for predicting and analysing ORFs from nucleotide sequences - hosted at bioinformatics.org

@@ Line 1: / Line 1: @@
-An '''open reading frame''' or '''ORF''' is a portion of an organism's genome which contains a sequence of bases that could potentially [[translation_(genetics)|encode]] a protein. The start and stop ends of the ORF are not equivalent to the ends of the [[mRNA]], but they are usually contained within the mRNA.  In a [[gene]], ORFs are located between the start-code sequence (initiation [[codon]]) and  the stop-code sequence (termination codon).  ORFs are usually encountered when sifting through pieces of [[DNA]] while trying to locate a [[gene]]. Since there exist variations in the start-code sequence of organisms with altered genetic code, the ORF will be identified differently. A typical ORF finder will employ algorithms based on existing [[genetic code]]s (including the altered ones) and all possible reading frames.
+An '''open reading frame''' ('''ORF''') is a portion of an organism's [[genome]] which contains a sequence of [[base pair|bases]] that could potentially [[translation (genetics)|encode]] a [[protein]]. The beginning and end points of a given ORF are not equivalent to the ends of the [[mRNA]], but they are usually contained within the mRNA sequence.  In a [[gene]], ORFs are located between the start-code sequence (initiation [[codon]]) and  the stop-code sequence (termination codon).  ORFs are usually encountered when sifting through pieces of [[DNA]] while trying to locate a [[gene]]. Since there exist variations in the start-code sequence of organisms with altered genetic code, the ORF will be identified differently. A typical ORF finder will employ algorithms based on existing [[genetic code]]s (including the altered ones) and all possible reading frames.
-In fact, the existence of an ORF, especially a long one, is usually a good indication of the presence of a gene in the surrounding sequence.  In this case, the ORF is part of the sequence that will be translated by the [[ribosome]]s, it will be long, and if the DNA is eukaryotic, the ORF may continue over gaps called [[intron]]s. However, short ORFs can also occur by chance outside of [[gene]]s.  Usually ORFs outside [[gene]]s are not very long and terminate after a few codons.
+In fact, the existence of an ORF, especially a long one, is usually a good indication of the presence of a gene in the surrounding sequence.  In this case, the ORF is part of the sequence that will be translated by the [[ribosome]]s, it will be long, and if the DNA is eukaryotic, the ORF may continue over gaps called [[intron]]s. However, short ORFs can also occur by chance outside of genes.  Usually ORFs outside genes are not very long and terminate after a few codons.
 Once a gene has been sequenced it is important to determine the correct open reading frame (ORF). Theoretically, the DNA sequence can be read in six [[reading frame]]s in organisms with double-stranded DNA; three in the forward and three in the reverse direction. The longest sequence without a stop codon usually determines the open reading frame. That is the case with [[prokaryote]]s. Eukaryotic mRNA is typically [[monocistronic]] and therefore only contains a single ORF.  A problem arises when working with [[eukaryotic]] pre-mRNA: long parts of the DNA within an ORF are not translated ([[intron]]s). When the aim is to find eukaryotic open reading frames it is necessary to have a look at the spliced messenger RNA [[mRNA]].
+For example, if you have 5'-UC'''UAA'''AGGUGAC-3' it has two out of three reading frames possible.  This is one of the two possible mRNA sequences of the transcript, and we see that it can be read in three different ways:
-For example, if you have 5'-UC'''UAA'''AGGUGAC-3' it has 2 out of 3 reading frames possible.  This is one of the 2 possible mRNA sequences of the transcript, and we see that it can be read in 3 different ways:
 # UCU AAA GGU GAC
-#  CUA AAG GUG etc
+# CUA AAG GUG etc
 # '''UAA''' AGG UGA etc
-As you can see, the 3rd possibility has a [[stop codon]] ('''UAA'''), thus only 2 out of the 3 reading frames are open (aka have no stop codons).
+As you can see, the third possibility has a [[stop codon]] (''UAA''), thus only two of the three reading frames are open (aka have no stop codons).
+== See also ==
+* [[Sequerome]] - A [[sequence profiling tool]] that links each [[BLAST]] record to the [[NCBI]] ORF enabling complete ORF analysis of a BLAST report.
 == External links ==
-* [http://bioweb.uwlax.edu./GenWeb/Molecular/Seq_Anal/Translation/translation.html Translation and Open Reading Frames] - A Site explaining Open Reading Frames
+* [http://bioweb.uwlax.edu./GenWeb/Molecular/Seq_Anal/Translation/translation.html Translation and Open Reading Frames]
 * [http://www.ncbi.nlm.nih.gov/projects/gorf/ NCBI ORF finder] - A web based interactive tool for predicting and analysing ORFs from nucleotide sequences.
 * [http://bioinformatics.org/sms/orf_find.html ORF finder] - A web based interactive tool for predicting and analysing ORFs from nucleotide sequences - hosted at bioinformatics.org
-* [[Sequerome]] - A [[Sequence profiling tool]] that links each [[BLAST]] record to the NCBI ORF enabling complete ORF analysis of a BLAST report
 [[Category:Molecular genetics]]