Clustal

CLUSTAL
Developer(s)
  • Des Higgins
  • Fabian Sievers
  • David Dineen
  • Andreas Wilm (all at the Conway Institute, UCD)
Stable release
1.2.2 / 1 July 2016
Written in C++
Operating system UNIX, Linux, Mac, MS-Windows
Type Bioinformatics tool
Licence GNU General Public License, version 2[1]
Website www.clustal.org/omega/

Clustal is a series of widely used computer programs for multiple sequence alignment in bioinformatics.[2]

History

Clustal has gone through several incarnations:

  • Clustal: The original software for progressive alignment based on a phylogenetic tree.[3]
  • ClustalV: A rewrite of the original Clustal package that included phylogenetic tree reconstruction on the final alignment for the first time.[4]
  • ClustalW: This version has a command-line interface.[5]
  • ClustalX: This version has a graphical user interface.[6]
  • ClustalΩ (Omega): The current standard version.[7][8]

The papers describing the Clustal software have been very highly cited, with two of them among the most cited papers of all time.[9]

The more recent version of the software is available for Windows, Mac OS, and Unix/Linux. It is also commonly used through a web interface, either on its own home page or hosted by the European Bioinformatics Institute.

Names

The guide tree in the initial programs was constructed via a UPGMA cluster analysis of the pairwise alignments, hence the name CLUSTAL.[10]cf.[11] The first four versions in 1988 had Arabic numerals (1 to 4), whereas with the fifth version Des Higgins switched to Roman numeral V in 1992.[10]cf.[12][13] In 1994 and in 1997, for the next two versions, the letters after the letter V were used and made to correspond to W for Weighted and X for X Window.[10]cf.[14][15] The name omega was chosen to mark a change from the previous ones.[10]

Function

All variants of Clustal align sequences by three main steps:

  1. Do a pairwise alignment
  2. Create a guide tree (or use a user-defined tree)
  3. Use the guide tree to carry out a multiple alignment

These steps are carried out automatically when the option "Do Complete Alignment" is selected. Other options are "Do Alignment from guide tree and phylogeny" and "Produce guide tree only".
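
The first two steps above can be illustrated with a short Python sketch. It is not Clustal's own code: the Needleman-Wunsch scoring values, the use of percent identity as a distance, and the function names nw_identity and upgma are assumptions chosen for illustration. UPGMA is used because it is the clustering method attributed to the initial programs. The third step (aligning sequences and profiles in guide-tree order) is omitted; the nested tuple returned here only shows the order in which that step would merge sequences.

```python
from itertools import combinations


def nw_identity(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align a and b (Needleman-Wunsch with a linear gap cost)
    and return the fraction of alignment columns that are identical.
    The scoring values are illustrative, not Clustal defaults."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back one optimal path, counting columns and identical pairs.
    i, j, same, cols = n, m, 0, 0
    while i > 0 or j > 0:
        cols += 1
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            same += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return same / cols


def upgma(names, dist):
    """Build a guide tree as nested tuples from pairwise distances.
    dist maps frozenset({x, y}) to the distance between x and y."""
    clusters = {name: 1 for name in names}  # cluster -> number of leaves
    d = dict(dist)
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        x, y = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        nx, ny = clusters.pop(x), clusters.pop(y)
        merged = (x, y)
        # Distance to the merged cluster is the size-weighted average.
        for other in clusters:
            d[frozenset((merged, other))] = (
                nx * d[frozenset((x, other))] + ny * d[frozenset((y, other))]
            ) / (nx + ny)
        clusters[merged] = nx + ny
    return next(iter(clusters))


# Step 1: pairwise alignments, converted to distances (1 - identity).
seqs = {"A": "GATTACA", "B": "GATTACA", "C": "GACTATA", "D": "TTTTTTT"}
dist = {frozenset((a, b)): 1 - nw_identity(seqs[a], seqs[b])
        for a, b in combinations(seqs, 2)}

# Step 2: guide tree; the most similar sequences are joined first.
tree = upgma(list(seqs), dist)
print(tree)  # ('D', ('C', ('A', 'B')))
```

In this toy data set the identical sequences A and B are joined first, then C, and finally the most divergent sequence D, which is the order a progressive aligner would follow in step 3.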

Input/Output

The program accepts a wide range of input formats, including NBRF/PIR, FASTA, EMBL/Swiss-Prot, Clustal, GCG/MSF, GCG9 RSF, and GDE.

The output format can be one or many of the following: Clustal, NBRF/PIR, GCG/MSF, PHYLIP, GDE, or NEXUS.
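
As an illustration of one of the input formats listed above, the following Python sketch reads FASTA text, in which each record starts with a ">" header line followed by one or more sequence lines. The function name read_fasta and the example records are hypothetical; this is a minimal reader for illustration, not the parser used by Clustal.

```python
def read_fasta(lines):
    """Parse FASTA-formatted lines into a dict of {identifier: sequence}.
    The identifier is taken as the first word after '>'."""
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            fields = line[1:].split()
            name = fields[0] if fields else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)  # sequence may span several lines
    return {n: "".join(parts) for n, parts in records.items()}


example = """\
>seq1 first example record
GATTACA
GATT
>seq2
GACTATA
"""
sequences = read_fasta(example.splitlines())
print(sequences)
```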

Settings

Many settings can be modified to adapt the alignment algorithm to different circumstances. The main parameters are the gap opening penalty and the gap extension penalty.
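
The effect of these two parameters can be sketched in Python. Under one common affine convention, assumed here, a run of consecutive gap characters costs the opening penalty once plus the extension penalty for each gap character. The penalty values and the function names gap_cost and total_gap_cost are illustrative assumptions, not Clustal's defaults or its exact scoring scheme.

```python
import re


def gap_cost(length, gap_open=10.0, gap_extend=0.5):
    """Affine cost of one run of `length` consecutive gap characters."""
    if length <= 0:
        return 0.0
    return gap_open + gap_extend * length


def total_gap_cost(aligned, gap_open=10.0, gap_extend=0.5):
    """Sum the affine cost over every run of '-' in an aligned sequence."""
    return sum(gap_cost(len(run), gap_open, gap_extend)
               for run in re.findall(r"-+", aligned))


one_long_gap = total_gap_cost("AC---GT")      # one run of three gaps
three_short_gaps = total_gap_cost("A-C-G-T")  # three runs of one gap
print(one_long_gap, three_short_gaps)  # 11.5 31.5
```

With a high opening penalty and a low extension penalty, a single long gap is much cheaper than several short ones, so raising the opening penalty tends to produce fewer gaps and lowering the extension penalty tends to allow longer ones.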

References

  1. ^ See file COPYING in the source archive. Accessed 2014-01-15.
  2. ^ Chenna R, Sugawara H, Koike T, Lopez R, Gibson TJ, Higgins DG, Thompson JD (2003). "Multiple sequence alignment with the Clustal series of programs". Nucleic Acids Res. 31 (13): 3497–3500. doi:10.1093/nar/gkg500. PMC 168907. PMID 12824352.
  3. ^ Higgins DG, Sharp PM (December 1988). "CLUSTAL: a package for performing multiple sequence alignment on a microcomputer". Gene. 73 (1): 237–44. doi:10.1016/0378-1119(88)90330-7. PMID 3243435. 
  4. ^ Higgins DG, Bleasby AJ, Fuchs R (April 1992). "CLUSTAL V: improved software for multiple sequence alignment". Comput. Appl. Biosci. 8 (2): 189–91. doi:10.1093/bioinformatics/8.2.189. PMID 1591615. 
  5. ^ Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD, Gibson TJ, Higgins DG (2007). "ClustalW and ClustalX version 2". Bioinformatics. 23 (21): 2947–2948. doi:10.1093/bioinformatics/btm404. PMID 17846036. 
  6. ^ Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG (1997). "The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools". Nucleic Acids Research. 25 (24): 4876–4882. doi:10.1093/nar/25.24.4876. PMC 147148. PMID 9396791.
  7. ^ Sievers, Fabian; Higgins, Desmond G. (2014-01-01). Russell, David J, ed. Multiple Sequence Alignment Methods. Methods in Molecular Biology. Humana Press. pp. 105–116. doi:10.1007/978-1-62703-646-7_6. ISBN 9781627036450.
  8. ^ Sievers, Fabian; Higgins, Desmond G. (2002-01-01). Current Protocols in Bioinformatics. John Wiley & Sons, Inc. doi:10.1002/0471250953.bi0313s48. ISBN 9780471250951. 
  9. ^ Van Noorden, R.; Maher, B.; Nuzzo, R. (2014). "The top 100 papers: Nature explores the most-cited research of all time". Nature. London. 514 (7524): 550–3. doi:10.1038/514550a. PMID 25355343. 
  10. ^ Des Higgins, presentation at the SMBE 2012 conference in Dublin.
  11. ^ Higgins, D. G.; Sharp, P. M. (1988). "CLUSTAL: A package for performing multiple sequence alignment on a microcomputer". Gene. 73 (1): 237–244. doi:10.1016/0378-1119(88)90330-7. PMID 3243435. 
  12. ^ Higgins, D. G.; Sharp, P. M. (1989). "Fast and sensitive multiple sequence alignments on a microcomputer". Computer Applications in the Biosciences (CABIOS). 5 (2): 151–153. doi:10.1093/bioinformatics/5.2.151. PMID 2720464. 
  13. ^ Higgins, D. G.; Bleasby, A. J.; Fuchs, R. (1992). "CLUSTAL V: Improved software for multiple sequence alignment". Computer Applications in the Biosciences (CABIOS). 8 (2): 189–191. doi:10.1093/bioinformatics/8.2.189. PMID 1591615. 
  14. ^ Thompson, J. D.; Higgins, D. G.; Gibson, T. J. (1994). "CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice". Nucleic Acids Research. 22 (22): 4673–4680. doi:10.1093/nar/22.22.4673. PMC 308517. PMID 7984417.
  15. ^ Thompson, J. D.; Gibson, T. J.; Plewniak, F.; Jeanmougin, F.; Higgins, D. G. (1997). "The CLUSTAL_X windows interface: Flexible strategies for multiple sequence alignment aided by quality analysis tools". Nucleic Acids Research. 25 (24): 4876–4882. doi:10.1093/nar/25.24.4876. PMC 147148. PMID 9396791.
