List of gene prediction software

This is a list of software tools and web portals used for gene prediction.

Ab initio approaches

| Name | Description | Species | Links | References |
|---|---|---|---|---|
| ATGpr | identifies translational initiation sites in cDNA sequences | | | |
| AUGUSTUS | eukaryotic gene predictor | Eukaryotes | Predict Train AUGUSTUS | [1] |
| BGF | ab initio gene prediction program based on a hidden Markov model (HMM) and dynamic programming | | webserver | |
| DIOGENES | a system for fast detection of coding regions in short genomic sequences | | | |
| Dragon Promoter Finder | software for recognition of vertebrate RNA polymerase II promoters | | | |
| EUGENE | integrative gene finding | Eukaryotes and prokaryotes | EuGene webserver | [2] |
| FGENESH | HMM-based gene structure prediction (multiple genes, both chains) | Eukaryotes | webserver | |
| FRAMED | finds genes and frameshifts in G+C-rich prokaryotic sequences | Prokaryotes | webserver | [3] |
| GENIUS | links ORFs in complete genomes to protein 3D structures | | | |
| geneid | program to predict genes, exons, splice sites and other signals along a DNA sequence | Eukaryotes | webserver | |
| GENEPARSER | parses a DNA sequence into introns and exons | | | |
| GeneMark | family of gene prediction programs | Prokaryotes and eukaryotes | webserver | [4] |
| GeneTack | prediction of genes with frameshifts in prokaryotic genomes | Prokaryotes | webserver | [5] |
| GENOMESCAN | predicts locations and exon-intron structures of genes in genomic sequences from a variety of organisms | | webserver | |
| GENSCAN | predicts complete gene structures in genomic DNA using a generalized hidden Markov model | | webserver | [6] |
| GLIMMER | finds genes in microbial DNA | Prokaryotes | sourcecode webserver | |
| GLIMMERHMM | eukaryotic gene-finding system | Eukaryotes | webserver | [7] |
| GrailEXP | predicts exons, genes, promoters, poly(A) sites, CpG islands, EST similarities, and repetitive elements within a DNA sequence | | | |
| mGene | a support vector machine (SVM) based system for finding genes | Eukaryotes | webserver | [8] |
| mGene.ngs | an SVM-based system for finding genes using heterogeneous information (RNA-seq, tiling arrays) | Eukaryotes | | [9] |
| MORGAN | a decision tree system for finding genes in vertebrate DNA | Eukaryotes | | |
| NIX | web tool for combining results from different programs (GRAIL, FEX, HEXON, MZEF, GENEMARK, GENEFINDER, FGENE, BLAST, POLYAH, REPEATMASKER, TRNASCAN) | | | |
| NNPP | promoter prediction by neural network | | | |
| NNSPLICE | splice site prediction by neural network | | | |
| ORF FINDER | a graphical analysis tool that finds all open reading frames | | | |
| Regulatory Sequence Analysis Tools | a series of modular computer programs designed for the detection of regulatory signals in non-coding sequences | | | |
| SPLICEPREDICTOR | a method to identify potential splice sites in (plant) pre-mRNA by sequence inspection using Bayesian statistical models | Eukaryotes | | |
| VEIL | hidden Markov model for finding genes in vertebrate DNA | Eukaryotes | Server | |
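
Several entries above rely on locating open reading frames (ORFs), the simplest signal used in gene finding. As a rough illustration only, and not the implementation of ORF FINDER or any other listed tool, the following Python sketch scans the three forward reading frames for an ATG followed by an in-frame stop codon. The function name `find_orfs` and the `min_codons` parameter are hypothetical, introduced here for the example; real tools also scan the reverse strand and support alternative genetic codes.

```python
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}  # standard genetic code stop codons


def find_orfs(seq, min_codons=2):
    """Return sorted (start, end, frame) tuples for forward-strand ORFs.

    start is the 0-based index of the ATG; end is the index just past
    the stop codon. min_codons counts codons before the stop codon.
    This is an illustrative sketch, not any listed tool's algorithm.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG of the ORF being read
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START:
                start = i
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # resume searching after the stop codon
    return sorted(orfs)


if __name__ == "__main__":
    for start, end, frame in find_orfs("CCATGAAATAGGG"):
        print(frame, start, end)
```

Ab initio predictors such as those listed go well beyond this: they score candidate regions with statistical models (for example hidden Markov models) rather than accepting every start-to-stop span.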

References

  1. Keller O, Kollmar M, Stanke M, Waack S (2011). "A novel hybrid gene prediction method employing protein multiple sequence alignments". Bioinformatics 27 (6): 757–763.
  2. Foissac S, Gouzy JP, Rombauts S, Mathé C, Amselem J, Sterck L, Van de Peer Y, Rouzé P, Schiex T (2008). "Genome Annotation in Plants and Fungi: EuGene as a model platform". Curr. Bioinform. 3: 87–97.
  3. Schiex T, Gouzy J, Moisan A, de Oliveira Y (2003). "FrameD: A flexible program for quality check and gene prediction in prokaryotic genomes and noisy matured eukaryotic sequences". Nucleic Acids Res. 31 (13): 3738–3741.
  4. Lukashin A, Borodovsky M (1998). "GeneMark.hmm: new solutions for gene finding". Nucleic Acids Research 26 (4): 1107–1115. doi:10.1093/nar/26.4.1107. PMC 147337. PMID 9461475.
  5. Antonov I, Borodovsky M (2010). "Genetack: frameshift identification in protein-coding sequences by the Viterbi algorithm". J Bioinform Comput Biol. 8 (3): 535–51. doi:10.1142/S0219720010004847. PMID 20556861.
  6. Burge C, Karlin S (1997). "Prediction of complete gene structures in human genomic DNA". J. Mol. Biol. 268 (1): 78–94. doi:10.1006/jmbi.1997.0951. PMID 9149143.
  7. Majoros WH, Pertea M, Salzberg SL. "TigrScan and GlimmerHMM: two open-source ab initio eukaryotic gene-finders". Bioinformatics 20: 2878–2879.
  8. Schweikert G, Zien A, et al., Rätsch G (2009). "mGene: Accurate SVM-based gene finding with an application to nematode genomes". Genome Res. 19 (11): 2133–2143.
  9. Gan X, Stegle O, Behr J, et al., Rätsch G, Mott R. "Multiple reference genomes and transcriptomes for Arabidopsis thaliana". Nature 477: 419–423.