Bioinformatics workflow management system

A bioinformatics workflow management system is a specialized workflow management system designed to compose and execute a series of computational or data manipulation steps (a workflow) for a bioinformatics application.

There are currently many different workflow systems. Some have been developed more generally as scientific workflow systems for use by scientists from many different disciplines, such as astronomy and earth science. All such systems are based on an abstract representation of how a computation proceeds in the form of a directed graph, where each node represents a task to be executed and each edge represents either data flow or an execution dependency between tasks. Each system typically provides a visual front-end that allows the user to build and modify complex applications with little or no programming expertise.
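The directed-graph model can be illustrated with a minimal sketch. It is not taken from any of the systems listed in this article: the function run_workflow, the task names, and the toy sequence are all hypothetical, and the sketch assumes Python 3.9 or later for the standard-library graphlib module. Each task receives the results of the tasks it depends on, so an edge here stands for both data flow and an execution dependency.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(tasks, dependencies):
    """Run every task after all of the tasks it depends on.

    tasks: dict mapping a task name to a function that takes a dict of
           upstream results (keyed by upstream task name).
    dependencies: dict mapping a task name to the set of task names
           whose results it consumes (the incoming edges of the graph).
    Returns the results of all tasks and the order in which they ran.
    """
    # Nodes are tasks; each node maps to its set of predecessors.
    graph = {name: dependencies.get(name, set()) for name in tasks}
    results = {}
    order = []
    # static_order() yields predecessors before the nodes that need them
    # and raises graphlib.CycleError if the graph contains a cycle.
    for name in TopologicalSorter(graph).static_order():
        upstream = {dep: results[dep] for dep in graph[name]}
        results[name] = tasks[name](upstream)
        order.append(name)
    return results, order


# Hypothetical toy workflow: load a sequence, compute two statistics
# independently, then combine them into a report.
tasks = {
    "load": lambda up: "ATGCGC",
    "gc_content": lambda up: sum(b in "GC" for b in up["load"]) / len(up["load"]),
    "length": lambda up: len(up["load"]),
    "report": lambda up: f"length={up['length']} gc={up['gc_content']:.2f}",
}
dependencies = {
    "gc_content": {"load"},
    "length": {"load"},
    "report": {"gc_content", "length"},
}

results, order = run_workflow(tasks, dependencies)
print(results["report"])  # length=6 gc=0.67
```

In this sketch the "gc_content" and "length" tasks do not depend on each other, so a real workflow engine would be free to run them in parallel; the sketch simply runs them one after the other in a valid order.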

Examples

  • BioExtract: a web-based system for querying biomolecular sequence data, executing analytic tools on the resulting extracts, and constructing workflows composed of such queries and tools.
  • Anduril: bioinformatics and image analysis
  • BioBIKE
  • Chipster
  • Discovery Net: one of the earliest examples of a scientific workflow system, later commercialized as InforSense, which was then acquired by IDBS.
  • Galaxy: initially targeted at genomics
  • GeneProf: web-based functional genomics experiments, e.g. RNA-seq or ChIP-seq
  • OnlineHPC: online workflow designer based on Taverna
  • Tavaxy:[1] A cloud-based bioinformatics workflow system that integrates features from both Taverna and Galaxy for NGS data analysis.
  • Taverna workbench:[2] an early e-Science system widely used in bioinformatics
  • VisTrails
  • Anvaya: a software application that provides an interface to bioinformatics tools and databases in a workflow environment, so that a set of analysis tools can be executed in series or in parallel. A distinguishing feature is its rules engine, which defines rules for logically connecting the existing tools. Anvaya also supports exhaustive comparative analysis through custom tools, which offer functionality not available in standard tools, and through built-in Perl parsers.

Comparisons between workflow systems

With a large number of bioinformatics workflow systems to choose from, it is difficult to understand and compare their features. Little work has been done to evaluate and compare the systems from a bioinformatician's perspective, especially with regard to the data types they can handle, the built-in functionality they provide to the user, or their performance and usability. Examples of existing comparisons include:

  • The paper "Scientific workflow systems - can one size fit all?",[3] which provides a high-level framework for comparing workflow systems based on their control flow and data flow properties. The systems compared include Discovery Net, Taverna, Triana and Kepler, as well as YAWL and BPEL.
  • The paper "Meta-workflows: pattern-based interoperability between Galaxy and Taverna",[4] which provides a more user-oriented comparison between Taverna and Galaxy in the context of enabling interoperability between the two systems.
  • The infrastructure paper "Delivering ICT Infrastructure for Biomedical Research",[5] which compares two workflow systems, Anduril and Chipster, in terms of infrastructure requirements in a cloud-delivery model.

References

  1. ^ Abouelhoda, M.; Issa, S.; Ghanem, M. (2012). "Tavaxy: Integrating Taverna and Galaxy workflows with cloud computing support". BMC Bioinformatics 13: 77. doi:10.1186/1471-2105-13-77. PMC 3583125. PMID 22559942.
  2. ^ Oinn, T.; Addis, M.; Ferris, J.; Marvin, D.; Senger, M.; Greenwood, M.; Carver, T.; Glover, K.; Pocock, M. R.; Wipat, A.; Li, P. (2004). "Taverna: A tool for the composition and enactment of bioinformatics workflows". Bioinformatics 20 (17): 3045–3054. doi:10.1093/bioinformatics/bth361. PMID 15201187.
  3. ^ Curcin, V; Ghanem, M (2008), Scientific workflow systems - can one size fit all?, Biomedical Engineering Conference, 2008. CIBEC 2008, IEEE, doi:10.1109/CIBEC.2008.4786077 
  4. ^ Abouelhoda, M; Ghanem, M; Alaa, S (2010), Meta-workflows: pattern-based interoperability between Galaxy and Taverna, Wands '10 Proceedings of the 1st International Workshop on Workflow Approaches to New Data-centric Science, ACM, doi:10.1145/1833398.1833400 
  5. ^ Nyrönen, TH; Laitinen, J et al. (2012), Delivering ICT infrastructure for biomedical research, Proceedings of the WICSA/ECSA 2012 Companion Volume (WICSA/ECSA '12), ACM, doi:10.1145/2361999.2362006 

External links

  • Oinn, T.; Greenwood, M.; Addis, M.; Alpdemir, M. N.; Ferris, J.; Glover, K.; Goble, C.; Goderis, A.; Hull, D.; Marvin, D.; Li, P.; Lord, P.; Pocock, M. R.; Senger, M.; Stevens, R.; Wipat, A.; Wroe, C. (2006). "Taverna: Lessons in creating a workflow environment for the life sciences". Concurrency and Computation: Practice and Experience 18 (10): 1067–1100. doi:10.1002/cpe.993.
  • Yu, J.; Buyya, R. (2005). "A taxonomy of scientific workflow systems for grid computing". ACM SIGMOD Record 34 (3): 44. doi:10.1145/1084805.1084814.
  • Curcin, V.; Ghanem, M. (2008). Scientific workflow systems - can one size fit all?. pp. 1–9. doi:10.1109/CIBEC.2008.4786077. Paper in CIBEC'08 comparing multiple workflow systems for bioinformatics applications.