Clustal

Clustal Omega
Developer(s) Des Higgins, Fabian Sievers, David Dineen, and Andreas Wilm (all at the Conway Institute, UCD)
Stable release 1.2.0 / 12 June 2013
Written in C++
Operating system UNIX, Linux, Mac, MS-Windows
Type Bioinformatics tool
Licence GNU General Public License, version 2[1]
Website www.clustal.org/omega/
ClustalW/ClustalX
Developer(s) Gibson T. (EMBL), Thompson J. (CNRS), Higgins D. (UCD)
Stable release 2.1 / 17 November 2010
Written in C++
Operating system UNIX, Linux, Mac, MS-Windows
Type Bioinformatics tool
Licence GNU Lesser General Public License [2]
Website www.clustal.org

Clustal is a widely used computer program for multiple sequence alignment.[3] There are three main variants:

  • ClustalW: the version with a command-line interface.[4]
  • ClustalX: the version with a graphical user interface.[5]
  • Clustal Omega: the latest addition to the Clustal family and, like ClustalW, a command-line-only program. It scales much better than previous versions, allowing hundreds of thousands of sequences to be aligned in only a few hours, and it can use multiple processors where present. Its alignments are also of higher quality than those of previous versions, as measured by a range of popular benchmarks.[6][7]

All three are available for Windows, Mac OS, and Unix/Linux.

The programs are available from the Clustal homepage or from the European Bioinformatics Institute FTP server.

Input/Output

Clustal accepts a wide range of input formats, including NBRF/PIR, FASTA, EMBL/Swiss-Prot, Clustal, GCG/MSF, GCG9 RSF, and GDE.

The output can be written in one or more of the following formats: Clustal, NBRF/PIR, GCG/MSF, PHYLIP, GDE, or NEXUS.
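As an illustration of the simplest of these input formats, the following is a minimal sketch of a FASTA reader. It is not Clustal's own parser; the function name `read_fasta` and the example records are hypothetical, and the sketch assumes plain text with one `>` header line per sequence.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence data may be wrapped over several lines.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical two-record input, for illustration only.
example = StringIO(">seq1 hypothetical example\nMKTAYIAK\nQRQISFVK\n>seq2\nMKTAYLAK\n")
records = list(read_fasta(example))
```

Each record is returned with its wrapped sequence lines joined into a single string, which is the form an aligner needs before any pairwise comparison.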

Multiple sequence alignment

A multiple alignment is produced in three main steps:

  1. Align every pair of sequences (pairwise alignment)
  2. Create a guide tree from the pairwise alignments (or use a user-defined tree)
  3. Use the guide tree to carry out the multiple alignment

All three steps are carried out automatically when the user selects "Do Complete Alignment". The other options are "Do Alignment from guide tree" and "Produce guide tree only".
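The three steps can be illustrated with a toy sketch. This is not Clustal's implementation: it assumes a simple match/mismatch score with a linear gap cost for the pairwise step, uses one minus the fraction of identical columns as the distance, builds the tree with UPGMA, and only reports the order in which groups would be merged rather than performing the profile alignments of step 3. All function names, scores, and sequences are hypothetical.

```python
from itertools import combinations


def nw_align(a, b, match=1, mismatch=-1, gap=-2):
    """Step 1: global pairwise alignment (Needleman-Wunsch, linear gap cost)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b, i, j = [], [], n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def distance(a, b):
    """Distance = 1 - fraction of aligned columns that are identical."""
    x, y = nw_align(a, b)
    identical = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return 1.0 - identical / len(x)


def upgma(names, dist):
    """Step 2: UPGMA guide tree as nested tuples; dist maps frozenset pairs to distances."""
    sizes = {name: 1 for name in names}
    d = dict(dist)
    while len(sizes) > 1:
        x, y = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        nx, ny = sizes.pop(x), sizes.pop(y)
        new = (x, y)
        for other in sizes:
            # Size-weighted average of the distances to the two merged clusters.
            d[frozenset((new, other))] = (
                nx * d[frozenset((x, other))] + ny * d[frozenset((y, other))]
            ) / (nx + ny)
        sizes[new] = nx + ny
    return next(iter(sizes))


def merge_order(tree):
    """Step 3 (order only): post-order list of the joins a progressive aligner would make."""
    if isinstance(tree, str):
        return []
    left, right = tree
    return merge_order(left) + merge_order(right) + [tree]


# Hypothetical toy sequences: A/B and C/D each differ at a single position.
seqs = {"A": "ACGTACGT", "B": "ACGTACGA", "C": "TTGGCCAA", "D": "TTGGCCAT"}
dist = {frozenset((p, q)): distance(seqs[p], seqs[q]) for p, q in combinations(seqs, 2)}
tree = upgma(list(seqs), dist)
order = merge_order(tree)
```

For these toy sequences the most similar pairs are joined first, so the tree groups A with B and C with D before joining the two groups. A real progressive aligner would align sequences or profiles at each join in that order.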

Settings

Users can align sequences with the default settings, but it is occasionally useful to customize the parameters.

The main parameters are the gap opening penalty and the gap extension penalty.
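How the two penalties interact can be shown with a small sketch of an affine gap cost. The convention used here (opening penalty plus extension penalty times gap length) is an assumption, since conventions differ between programs, and the numeric values are purely illustrative rather than Clustal's defaults. The function name `affine_gap_cost` is hypothetical.

```python
def affine_gap_cost(length, gap_open=10, gap_extend=1):
    """Cost of a single gap spanning `length` columns.

    Assumed convention: gap_open is charged once per gap and
    gap_extend once per gapped column.
    """
    if length <= 0:
        return 0
    return gap_open + gap_extend * length


# Six gapped columns placed as one long gap versus three short gaps.
one_long_gap = affine_gap_cost(6)
three_short_gaps = 3 * affine_gap_cost(2)
```

With a high opening penalty relative to the extension penalty, one long gap is much cheaper than several short ones covering the same number of columns, so raising the gap opening penalty tends to produce fewer, longer gaps.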

Names

The guide tree in the initial programs was constructed via a UPGMA cluster analysis of the pairwise alignments, hence the name CLUSTAL.[8] (cf. [9]) The first four versions, in 1988, had Arabic numerals (1 to 4), whereas for the fifth version, in 1992, Des Higgins switched to the Roman numeral V.[8] (cf. [10][11]) For the next two versions, in 1994 and 1997, the letters following V were used and made to correspond to W for Weighted and X for X Window.[8] (cf. [12][13]) The name Omega was chosen to mark a change from the previous ones.[8]

References

  1. ^ See the file COPYING in the source archive. Accessed 2014-01-15.
  2. ^ "ClustalW / ClustalX: Multiple Sequence Alignment". Retrieved 1 October 2013. 
  3. ^ Chenna R, Sugawara H, Koike T, Lopez R, Gibson TJ, Higgins DG, Thompson JD (2003). "Multiple sequence alignment with the Clustal series of programs". Nucleic Acids Res 31 (13): 3497–3500. doi:10.1093/nar/gkg500. PMC 168907. PMID 12824352. 
  4. ^ Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD, Gibson TJ, Higgins DG (2007). "ClustalW and ClustalX version 2". Bioinformatics 23 (21): 2947–2948. doi:10.1093/bioinformatics/btm404. PMID 17846036. 
  5. ^ Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG (1997). "The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools". Nucleic Acids Research 25 (24): 4876–4882. doi:10.1093/nar/25.24.4876. PMC 147148. PMID 9396791. 
  6. ^ Sievers F, Wilm A, Dineen DG, Gibson TJ, Karplus K, Li W, Lopez R, McWilliam H, Remmert M, Söding J, Thompson JD, Higgins DG (2011). "Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega". Mol Syst Biol 7 (539). doi:10.1038/msb.2011.75.
  7. ^ "Clustal Omega source code". Retrieved 2014-01-15. 
  8. ^ a b c d Des Higgins, presentation at the SMBE 2012 conference in Dublin.
  9. ^ Higgins DG, Sharp PM (1988). "CLUSTAL: A package for performing multiple sequence alignment on a microcomputer". Gene 73 (1): 237–244. doi:10.1016/0378-1119(88)90330-7. PMID 3243435.
  10. ^ Higgins DG, Sharp PM (1989). "Fast and sensitive multiple sequence alignments on a microcomputer". Computer Applications in the Biosciences: CABIOS 5 (2): 151–153. PMID 2720464.
  11. ^ Higgins DG, Bleasby AJ, Fuchs R (1992). "CLUSTAL V: Improved software for multiple sequence alignment". Computer Applications in the Biosciences: CABIOS 8 (2): 189–191. PMID 1591615.
  12. ^ Thompson JD, Higgins DG, Gibson TJ (1994). "CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice". Nucleic Acids Research 22 (22): 4673–4680. doi:10.1093/nar/22.22.4673. PMC 308517. PMID 7984417.
  13. ^ Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG (1997). "The CLUSTAL_X windows interface: Flexible strategies for multiple sequence alignment aided by quality analysis tools". Nucleic Acids Research 25 (24): 4876–4882. doi:10.1093/nar/25.24.4876. PMC 147148. PMID 9396791.
