
Galaxy (computational biology): Difference between revisions

From Wikipedia, the free encyclopedia
Tnabtaf (talk | contribs)
→‎Accessibility: Added reference
Tnabtaf (talk | contribs)
Added several references throughout the article.
Line 26: Line 26:
}}
}}


'''Galaxy''' is "an open, web-based platform for performing accessible, reproducible, and transparent genomic science."<ref>{{cite pmid|20738864}}</ref> Galaxy is a [[scientific workflow system]] that aims to make [[computational biology]] accessible to research scientists that do not have [[computer programming]] experience. Although it was initially developed for genomics research, it is largely domain agnostic and is now used a general [[bioinformatics workflow management systems|bioinformatics workflow management system]].<ref>http://galaxyproject.org/wiki/Public%20Galaxy%20Servers</ref>
'''Galaxy'''<ref>{{cite pmid|20738864}}</ref><ref>{{cite pmid|20069535}}</ref><ref>{{cite pmid|18428782}}</ref> is a [[scientific workflow system|scientific workflow]] and [[data integration]]<ref>{{cite pmid|21531983}}</ref> platform that aims to make [[computational biology]] accessible to research scientists that do not have [[computer programming]] experience. Although it was initially developed for genomics research, it is largely domain agnostic and is now used a general [[bioinformatics workflow management systems|bioinformatics workflow management system]].<ref>http://galaxyproject.org/wiki/Public%20Galaxy%20Servers</ref>



== Scientific Workflows ==
== Scientific Workflows ==
Line 33: Line 34:


== Project Goals ==
== Project Goals ==

Galaxy is "an open, web-based platform for performing accessible, reproducible, and transparent genomic science."<ref>{{cite pmid|20738864}}</ref>


=== Accessibility ===
=== Accessibility ===


[[Computational biology]] is a specialized domain that often requires knowledge of [[computer programming]]. Galaxy aims to give biomedical researchers access to computational biology without also requiring them to understand computer programming.<ref>{{cite pmid|21775304}}</ref> Galaxy does this by stressing a simple user interface over the ability to build complex workflows. This design choice makes it relatively easy to build typical analyses, but more difficult to build complex workflows that include, for example, looping constructs. (See [[Taverna workbench]] for an example system that supports looping.)
[[Computational biology]] is a specialized domain that often requires knowledge of [[computer programming]]. Galaxy aims to give biomedical researchers access to computational biology without also requiring them to understand computer programming.<ref>{{cite pmid|21775304}}</ref><ref>{{cite pmid|17568012}}</ref> Galaxy does this by stressing a simple user interface<ref>{{cite pmid|20804568}}</ref> over the ability to build complex workflows. This design choice makes it relatively easy to build typical analyses, but more difficult to build complex workflows that include, for example, looping constructs. (See [[Taverna workbench]] for an example system that supports looping.)


=== Reproducibility ===
=== Reproducibility ===
Line 73: Line 76:
== Implementation ==
== Implementation ==


Galaxy is [[open-source software]] implemented using the [[Python (programming language)|Python programming language]]. It is developed by the Galaxy team<ref>http://galaxyproject.org/wiki/Galaxy%20Team</ref> at [[Penn State]] and [[Emory University]], and the [[#Community|Galaxy Community]].
Galaxy is [[open-source software]] implemented using the [[Python (programming language)|Python programming language]]. It is developed by the Galaxy team<ref>http://galaxyproject.org/wiki/Galaxy%20Team</ref> at [[Penn State]] and [[Emory University]], and the [[#Community|Galaxy Community]].<ref>{{cite pmid|21347127}}</ref>


== Community ==
== Community ==

Revision as of 16:50, 17 August 2011

Repository
Written in: Python
Operating system: Unix-like
Available in: English
Type: Scientific workflow system
License: See Wiki
Website: GalaxyProject.org

Galaxy[1][2][3] is a scientific workflow and data integration[4] platform that aims to make computational biology accessible to research scientists who do not have computer programming experience. Although it was initially developed for genomics research, it is largely domain agnostic and is now used as a general bioinformatics workflow management system.[5]


Scientific Workflows

Galaxy is a scientific workflow system. These systems provide a means to build multi-step computational analyses akin to a recipe. They typically provide a graphical user interface for specifying what data to operate on, what steps to take, and what order to do them in.
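The recipe analogy can be made concrete with a minimal sketch. It is not Galaxy code: the step names and the run_recipe helper are hypothetical, chosen only to show data passing through an ordered list of steps.

```python
from typing import Callable, List, Tuple

# A step is a (name, function) pair. All names here are hypothetical.
Step = Tuple[str, Callable[[List[str]], List[str]]]


def run_recipe(data: List[str], steps: List[Step]) -> List[str]:
    """Apply each step to the output of the previous one, in order."""
    for _name, func in steps:
        data = func(data)
    return data


# What steps to take, and in what order.
recipe: List[Step] = [
    ("drop_empty", lambda seqs: [s for s in seqs if s]),
    ("uppercase", lambda seqs: [s.upper() for s in seqs]),
    ("keep_long", lambda seqs: [s for s in seqs if len(s) >= 4]),
]

# What data to operate on.
result = run_recipe(["acgt", "", "ac", "ggcta"], recipe)
print(result)
```

A graphical workflow system presents the same three choices (data, steps, order) through a user interface rather than through code.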

Project Goals

Galaxy is "an open, web-based platform for performing accessible, reproducible, and transparent genomic science."[6]

Accessibility

Computational biology is a specialized domain that often requires knowledge of computer programming. Galaxy aims to give biomedical researchers access to computational biology without also requiring them to understand computer programming.[7][8] Galaxy does this by stressing a simple user interface[9] over the ability to build complex workflows. This design choice makes it relatively easy to build typical analyses, but more difficult to build complex workflows that include, for example, looping constructs. (See Taverna workbench for an example system that supports looping.)

Reproducibility

Reproducibility is a key goal of science: when scientific results are published, the publications should include enough information that others can repeat the experiment and get the same results. There have been many recent efforts to extend this goal from the bench (the "wet lab") to computational experiments (the "dry lab") as well. This has proved to be a more difficult task than initially expected.[10]

Galaxy supports reproducibility by capturing sufficient information about every step in a computational analysis, so that the analysis can be repeated, exactly, at any point in the future. This includes keeping track of all input, intermediate, and final datasets, as well as the parameters provided to each step and the order in which the steps were run.
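One way to picture this kind of record keeping is the sketch below. It is an illustration under stated assumptions, not Galaxy's implementation: the TOOLS registry, run_step, and replay are hypothetical names, and the log stores dataset fingerprints rather than the datasets themselves to keep the example short.

```python
import hashlib
import json
from typing import Any, Callable, Dict, List


def fingerprint(value: Any) -> str:
    """Stable hash of a JSON-serializable dataset."""
    return hashlib.sha256(json.dumps(value, sort_keys=True).encode()).hexdigest()


# Hypothetical tool registry: tool name -> function.
TOOLS: Dict[str, Callable[..., Any]] = {
    "filter_min_length": lambda seqs, min_length: [s for s in seqs if len(s) >= min_length],
    "reverse": lambda seqs: [s[::-1] for s in seqs],
}


def run_step(log: List[dict], tool: str, data: Any, **params: Any) -> Any:
    """Run one tool and record its order, parameters, input, and output."""
    output = TOOLS[tool](data, **params)
    log.append({
        "order": len(log) + 1,
        "tool": tool,
        "params": params,
        "input": fingerprint(data),
        "output": fingerprint(output),
    })
    return output


def replay(log: List[dict], data: Any) -> Any:
    """Repeat the recorded steps in order and verify each result."""
    for record in sorted(log, key=lambda r: r["order"]):
        if fingerprint(data) != record["input"]:
            raise ValueError(f"input mismatch at step {record['order']}")
        data = TOOLS[record["tool"]](data, **record["params"])
        if fingerprint(data) != record["output"]:
            raise ValueError(f"output mismatch at step {record['order']}")
    return data


log: List[dict] = []
raw = ["acgt", "ac", "ggcta"]
filtered = run_step(log, "filter_min_length", raw, min_length=3)
final = run_step(log, "reverse", filtered)
replayed = replay(log, raw)
```

Because the log holds the tool name, parameters, and step order, replaying it against the same input yields the same final dataset, and any divergence is reported at the step where it occurs.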

Transparency

Galaxy supports transparency in scientific research by enabling researchers to share any of their Galaxy Objects, either publicly or with specific individuals. Shared items can be examined in detail, rerun at will, and copied and modified to test hypotheses.
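The sharing model described here can be sketched as follows. This is a hypothetical illustration, not Galaxy's data model: the SharedItem class, its fields, and the user names are assumptions made for the example.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class SharedItem:
    """Hypothetical shareable object (for example a history or workflow)."""
    owner: str
    content: Dict[str, object]
    public: bool = False
    shared_with: Set[str] = field(default_factory=set)

    def can_view(self, user: str) -> bool:
        # Visible to everyone if public, otherwise to the owner and named users.
        return self.public or user == self.owner or user in self.shared_with

    def copy_for(self, user: str) -> "SharedItem":
        # A copy belongs to the new user; edits do not affect the original.
        if not self.can_view(user):
            raise PermissionError(f"{user} cannot access this item")
        return SharedItem(owner=user, content=copy.deepcopy(self.content))


item = SharedItem(owner="alice", content={"min_length": 3})
item.shared_with.add("bob")

bob_copy = item.copy_for("bob")
bob_copy.content["min_length"] = 5  # modify the copy to test a hypothesis
```

Copying rather than editing in place means the shared original stays intact while others experiment with their own versions.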

Galaxy Objects: Histories, Workflows, Datasets and Pages

Galaxy objects are anything that can be saved, persisted, and shared in Galaxy:

Histories
Histories are computational analyses (recipes) run with specified input datasets, computational steps and parameters. Histories include all intermediate and output datasets as well.
Workflows
Workflows are computational analyses that specify all the steps (and parameters) in the analysis, but none of the data. Workflows are used to run the same analysis against multiple sets of input data.
Datasets
Datasets include any input, intermediate, or output dataset used or produced in an analysis.
Pages
Histories, workflows and datasets can include user-provided annotation. Galaxy Pages enables the creation of a virtual paper that describes the how and why of the overall experiment. Tight integration of Pages with Histories, Workflows, and Datasets supports this goal.
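The distinction between histories and workflows can be sketched with a few small data structures. These are illustrative assumptions, not Galaxy's classes: Step, Workflow, History, the TOOLS registry, and the run function are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Step:
    tool: str
    params: Dict[str, Any]


@dataclass
class Workflow:
    """Steps and parameters only; no data."""
    steps: List[Step]


@dataclass
class History:
    """Steps plus every input, intermediate, and output dataset."""
    steps: List[Step] = field(default_factory=list)
    datasets: List[Any] = field(default_factory=list)

    def extract_workflow(self) -> Workflow:
        # Keep the recipe, drop the data.
        return Workflow(steps=list(self.steps))


# Hypothetical tool registry: tool name -> function.
TOOLS: Dict[str, Callable[..., Any]] = {
    "sort": lambda rows: sorted(rows),
    "head": lambda rows, n: rows[:n],
}


def run(workflow: Workflow, input_data: Any) -> History:
    """Run a workflow on one input and return the resulting history."""
    history = History(datasets=[input_data])
    for step in workflow.steps:
        output = TOOLS[step.tool](history.datasets[-1], **step.params)
        history.steps.append(step)
        history.datasets.append(output)
    return history


workflow = Workflow(steps=[Step("sort", {}), Step("head", {"n": 2})])
first = run(workflow, [3, 1, 2])
second = run(first.extract_workflow(), [9, 7, 8, 6])
```

Here the same steps are run against two inputs, and each run yields its own history containing the input, the intermediate sorted list, and the final output.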

Availability

Galaxy is available:

  1. As a free public web server,[11] supported by the Galaxy Project.[12] This server includes many bioinformatics tools that are widely useful in many areas of genomics research. Users can create logins, and save histories, workflows, and datasets on the server. These saved items can also be shared with others.
  2. As open-source software that can be downloaded, installed and customized to address specific needs.[13] Galaxy can be installed locally or using a computing cloud.[14]
  3. As public web servers hosted by other organizations.[15] Several organizations with their own Galaxy installation have also opted to make those servers available to others.

Implementation

Galaxy is open-source software implemented using the Python programming language. It is developed by the Galaxy team[16] at Penn State and Emory University, and the Galaxy Community.[17]

Community

Galaxy is an open-source project whose community includes users, organizations that install their own instances, Galaxy developers, and bioinformatics tool developers. The Galaxy project has mailing lists,[18] a community wiki,[19] and annual meetings.[20]

References

  1. ^ PMID 20738864
  2. ^ PMID 20069535
  3. ^ PMID 18428782
  4. ^ PMID 21531983
  5. ^ http://galaxyproject.org/wiki/Public%20Galaxy%20Servers
  6. ^ PMID 20738864
  7. ^ PMID 21775304
  8. ^ PMID 17568012
  9. ^ PMID 20804568
  10. ^ PMID 19174838
  11. ^ http://usegalaxy.org/
  12. ^ http://galaxyproject.org/
  13. ^ http://getgalaxy.org/
  14. ^ PMID 21210983
  15. ^ http://galaxyproject.org/wiki/Public%20Galaxy%20Servers
  16. ^ http://galaxyproject.org/wiki/Galaxy%20Team
  17. ^ PMID 21347127
  18. ^ http://galaxyproject.org/wiki/Mailin%20Lists
  19. ^ http://galaxyproject.org/wiki
  20. ^ http://galaxyproject.org/wiki/Events