A biochemical cascade is a series of chemical reactions in which the products of one reaction are consumed in the next reaction. These cascades facilitate the transformation or generation of complex molecules in small steps. At each step, various controlling factors regulate the reaction, allowing the cell to respond effectively to cues about its changing internal and external environments. The reactions are orchestrated by complex molecular networks, which consist of proteins/enzymes or RNAs (second messengers), connected by activation or synthesis in biological processes.
Introduction
In biochemistry, several important enzymatic cascades and signal transduction cascades participate in metabolic pathways or signalling networks, in which enzymes usually catalyze the reactions. For example, the tissue factor pathway in the coagulation cascade of secondary hemostasis is the primary pathway leading to fibrin formation, and thus to the initiation of blood coagulation. The pathway is a series of reactions in which a zymogen (inactive enzyme precursor) of a serine protease and its glycoprotein co-factors are activated to become active components that then catalyze the next reaction in the cascade, ultimately resulting in cross-linked fibrin.
Another example is the sonic hedgehog signaling pathway, one of the key regulators of embryonic development, which is present in all bilaterians. Different parts of the embryo have different concentrations of hedgehog signaling proteins, which give cells the information needed for the embryo to develop properly into a head or a tail. When the pathway malfunctions, it can result in diseases such as basal cell carcinoma. Recent studies point to the role of hedgehog signaling in regulating adult stem cells involved in the maintenance and regeneration of adult tissues. The pathway has also been implicated in the development of some cancers. Drugs that specifically target hedgehog signaling to fight disease are being actively developed by a number of pharmaceutical companies.
Most biochemical cascades are series of events in which one event triggers the next in a linear fashion. Negative cascades, however, include events that occur in a circular fashion, or that can cause or be caused by multiple events.
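The distinction between linear cascades and those with circular or multiply connected events can be sketched as a small graph check. This is only an illustration: the function name `classify_cascade`, the labels `"linear"`, `"cyclic"` and `"branched"`, and the event names `A`, `B`, `C` are assumptions made for the example and do not come from any pathway tool.

```python
from collections import defaultdict


def classify_cascade(edges):
    """Classify a cascade given (trigger, triggered) event pairs.

    Returns "cyclic" if some event can eventually re-trigger itself,
    "linear" if every event has at most one trigger and triggers at most
    one event, and "branched" otherwise.
    """
    graph = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for src, dst in edges:
        graph[src].append(dst)
        indegree[dst] += 1
        nodes.update((src, dst))

    # Depth-first search with three states to detect a cycle.
    UNSEEN, ACTIVE, DONE = 0, 1, 2
    state = {node: UNSEEN for node in nodes}

    def has_cycle(node):
        state[node] = ACTIVE
        for nxt in graph[node]:
            if state[nxt] == ACTIVE:
                return True
            if state[nxt] == UNSEEN and has_cycle(nxt):
                return True
        state[node] = DONE
        return False

    if any(state[node] == UNSEEN and has_cycle(node) for node in nodes):
        return "cyclic"
    if all(len(graph[node]) <= 1 and indegree[node] <= 1 for node in nodes):
        return "linear"
    return "branched"


# Hypothetical event names, used only to exercise the function.
print(classify_cascade([("A", "B"), ("B", "C")]))   # one event triggers the next
print(classify_cascade([("A", "B"), ("B", "A")]))   # circular
print(classify_cascade([("A", "B"), ("A", "C")]))   # one event causes several
```

Real cascades carry far more detail (reaction rates, co-factors, feedback strength); the sketch only captures the topology described in the text.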
Biochemical cascades include:
- The Complement system
- The Insulin Signaling Pathway
- The Sonic hedgehog Signaling Pathway
- The Wnt signaling pathway
- The JAK-STAT signaling pathway
- The Adrenergic receptor Pathways
- The Acetylcholine receptor Pathways
Negative cascades include:
Pathway construction
Pathway building is the process of identifying and integrating the entities, interactions, and associated annotations, and populating the knowledge base. It has been performed by individual groups studying a network of interest (e.g., an immune signaling pathway) as well as by large bioinformatics consortia (e.g., the Reactome Project) and commercial entities (e.g., Ingenuity Systems). Pathway construction can have either a data-driven objective (DDO) or a knowledge-driven objective (KDO). Data-driven pathway construction is used to generate relationship information for genes or proteins identified in a specific experiment, such as a microarray study. Knowledge-driven pathway construction entails developing a detailed pathway knowledge base for a particular domain of interest, such as a cell type, disease, or system. The curation process of a biological pathway entails identifying and structuring content, mining information manually and/or computationally, and assembling a knowledge base using appropriate software tools.
Figure: A schematic illustrating the major steps involved in the data-driven and knowledge-driven construction processes.
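As a rough illustration of what "entities, interactions, and associated annotations" might look like in a knowledge base, the sketch below defines a minimal pathway record and a data-driven builder that keeps only the known interactions among genes seen in an experiment. The class names, field names, the `build_from_gene_list` helper and the gene labels are all hypothetical; real curation tools use much richer data models.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Interaction:
    source: str
    target: str
    kind: str  # e.g. "activation" or "inhibition"


@dataclass
class Pathway:
    name: str
    entities: set = field(default_factory=set)
    interactions: list = field(default_factory=list)
    annotations: dict = field(default_factory=dict)

    def add_interaction(self, source, target, kind):
        """Record an interaction and register both partners as entities."""
        self.entities.update((source, target))
        self.interactions.append(Interaction(source, target, kind))

    def annotate(self, **context):
        """Attach context such as species, tissue or disease."""
        self.annotations.update(context)


def build_from_gene_list(name, genes, known_interactions):
    """Data-driven construction: keep interactions whose partners were both observed."""
    observed = set(genes)
    pathway = Pathway(name)
    for source, target, kind in known_interactions:
        if source in observed and target in observed:
            pathway.add_interaction(source, target, kind)
    return pathway


# Hypothetical inputs: a prior-knowledge interaction list and an experiment's gene list.
known = [
    ("geneA", "geneB", "activation"),
    ("geneB", "geneC", "inhibition"),
    ("geneC", "geneD", "activation"),
]
prototype = build_from_gene_list("toy pathway", ["geneA", "geneB", "geneC"], known)
prototype.annotate(species="Homo sapiens")
print(sorted(prototype.entities), len(prototype.interactions), prototype.annotations)
```

A knowledge-driven build would instead start from the domain of interest and add interactions as they are curated, using the same `add_interaction` and `annotate` calls.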
For either DDO or KDO pathway construction, the first step is to mine pertinent information about the entities and interactions from relevant information sources. The information retrieved is assembled using appropriate formats, information standards, and pathway building tools to obtain a pathway prototype. The pathway is further refined to include context-specific annotations such as species, cell/tissue type, or disease type. The pathway can then be verified by domain experts and updated by curators based on appropriate feedback. Recent attempts to improve knowledge integration have led to refined classifications of cellular entities, such as the Gene Ontology (GO), and to the assembly of structured knowledge repositories. Data repositories, which contain information regarding sequence data, metabolism, signaling, reactions, and interactions, are a major source of information for pathway building. A few useful databases are described in the following table.
|Database||Curation Type||GO Annotation (Y/N)||Description|
|1. Protein-protein interactions databases|
|BIND||Manual Curation||N||200,000 documented biomolecular interactions and complexes|
|MINT||Manual Curation||N||Experimentally verified interactions|
|HPRD||Manual Curation||N||Elegant and comprehensive presentation of the interactions, entities and evidence|
|MPact||Manual and Automated Curation||N||Yeast interactions. A part of MIPS|
|DIP||Manual and Automated Curation||Y||Experimentally determined interactions|
|IntAct||Manual Curation||Y||Database and analysis system of binary and multi-protein interactions|
|PDZBase||Manual Curation||N||PDZ Domain containing proteins|
|GNPV||Manual and Automated Curation||Y||Based on specific experiments and literature|
|BioGrid||Manual Curation||Y||Physical and genetic interactions|
|UniHi||Manual and Automated Curation||Y||Comprehensive human protein interactions|
|OPHID||Manual Curation||Y||Combines PPI from BIND, HPRD, and MINT|
|2. Metabolic Pathway databases|
|EcoCyc||Manual and Automated Curation||Y||Entire genome and biochemical machinery of E. coli|
|MetaCyc||Manual Curation||N||Pathways of over 165 species|
|HumanCyc||Manual and Automated Curation||N||Human metabolic pathways and the human genome|
|BioCyc||Manual and Automated Curation||N||Collection of databases for several organisms|
|3. Signaling Pathway databases|
|KEGG||Manual Curation||Y||Comprehensive collection of pathways such as human disease, signaling, genetic information processing pathways. Links to several useful databases|
|PANTHER||Manual Curation||N||Compendium of metabolic and signaling pathways built using CellDesigner. Pathways can be downloaded in SBML format|
|Reactome||Manual Curation||Y||Hierarchical layout. Extensive links to relevant databases such as NCBI, ENSEMBL, UNIPROT, HAPMAP, KEGG, CHEBI, PubMed, GO. Follows PSI-MI standards|
|Biomodels||Manual Curation||Y||Domain experts curated biological connection maps and associated mathematical models|
|STKE||Manual Curation||N||Repository of canonical pathways|
|Ingenuity Systems||Manual Curation||Y||Commercial mammalian biological knowledgebase about genes, drugs, chemical, cellular and disease processes, and signaling and metabolic pathways|
|PID||Manual Curation||Y||Compendium of several highly structured, assembled signaling pathways|
|BioPP||Manual and Automated Curation||Y||Repository of biological pathways built using CellDesigner|
Legend: Y – Yes, N – No; BIND – Biomolecular Interaction Network Database, DIP – Database of Interacting Proteins, GNPV – Genome Network Platform Viewer, HPRD – Human Protein Reference Database, MINT – Molecular Interaction database, MIPS – Munich Information center for Protein Sequences, UniHi – Unified Human Interactome, OPHID – Online Predicted Human Interaction Database, EcoCyc – Encyclopaedia of E. coli Genes and Metabolism, MetaCyc – a Metabolic Pathway database, KEGG – Kyoto Encyclopedia of Genes and Genomes, PANTHER – Protein Analysis Through Evolutionary Relationship database, STKE – Signal Transduction Knowledge Environment, PID – The Pathway Interaction Database, BioPP – Biological Pathway Publisher. A comprehensive list of resources can be found at http://www.pathguide.org.
Pathway-Related Databases and Tools
The increasing amount of genomic and molecular information is the basis for understanding higher-order biological systems, such as the cell and the organism, and their interactions with the environment, as well as for medical, industrial and other practical applications. The KEGG resource (http://www.genome.jp/kegg/) provides a reference knowledge base for linking genomes to biological systems, categorized as building blocks in the genomic space (KEGG GENES), the chemical space (KEGG LIGAND), wiring diagrams of interaction networks and reaction networks (KEGG PATHWAY), and ontologies for pathway reconstruction (BRITE database). The KEGG PATHWAY database is a collection of manually drawn pathway maps for metabolism, genetic information processing, environmental information processing such as signal transduction, ligand–receptor interaction and cell communication, various other cellular processes and human diseases, all based on extensive survey of published literature.
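A short sketch of how a pathway list from KEGG PATHWAY might be retrieved and parsed programmatically. The REST base URL `https://rest.kegg.jp`, the `list/pathway/<organism>` operation and the tab-separated response layout are assumptions not stated in the text above, and the two sample lines are illustrative rather than a verified download; check the current KEGG API documentation and usage terms before relying on them.

```python
from urllib.request import urlopen

# Assumption: base URL of KEGG's public REST interface.
KEGG_REST = "https://rest.kegg.jp"


def parse_pathway_list(text):
    """Parse 'identifier<TAB>description' lines into a dict."""
    pathways = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        identifier, _, description = line.partition("\t")
        pathways[identifier.strip()] = description.strip()
    return pathways


def fetch_pathway_list(organism="hsa"):
    """Download and parse the pathway list for an organism code (needs network access)."""
    with urlopen(f"{KEGG_REST}/list/pathway/{organism}") as response:
        return parse_pathway_list(response.read().decode("utf-8"))


# Illustrative sample in the assumed response format, so the parser can be tried offline.
sample = (
    "hsa00010\tGlycolysis / Gluconeogenesis - Homo sapiens (human)\n"
    "hsa04340\tHedgehog signaling pathway - Homo sapiens (human)\n"
)
for identifier, description in parse_pathway_list(sample).items():
    print(identifier, "->", description)
```

Keeping the parser separate from the download makes it possible to test the parsing logic without contacting the server.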
Gene Map Annotator and Pathway Profiler (GenMAPP) (http://www.genmapp.org/) is a free, open-source, stand-alone computer program designed for organizing, analyzing, and sharing genome-scale data in the context of biological pathways. The GenMAPP database supports multiple gene annotations and species, as well as custom species database creation for a potentially unlimited number of species. Pathway resources are expanded by using homology information to translate pathway content between species and by extending existing pathways with data derived from conserved protein interactions and coexpression. New modes of data visualization, including time-course, single nucleotide polymorphism (SNP), and splicing data, have been implemented with the GenMAPP database to support the analysis of complex data. GenMAPP also offers ways to display and share data through HTML export of analyses for entire sets of pathways as organized web pages (http://www.genmapp.org/tutorials/Converting-MAPPs-between-species.pdf). In short, GenMAPP provides a means to rapidly interrogate complex experimental data for pathway-level changes in a diverse range of organisms.
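The idea of using homology information to translate pathway content between species can be illustrated with a simple ortholog lookup. This is not GenMAPP's algorithm or file format; the function `translate_pathway` and the small human-to-mouse mapping are assumptions chosen for the example, and one gene is deliberately left out of the mapping to show how unmapped content might be reported.

```python
def translate_pathway(genes, orthologs):
    """Map each gene to its ortholog in the target species.

    Returns (translated, unmapped): genes with a known ortholog are
    replaced by it, genes without one are reported separately.
    """
    translated, unmapped = [], []
    for gene in genes:
        if gene in orthologs:
            translated.append(orthologs[gene])
        else:
            unmapped.append(gene)
    return translated, unmapped


# Hypothetical, incomplete human-to-mouse mapping for a few hedgehog pathway genes.
human_to_mouse = {"SHH": "Shh", "PTCH1": "Ptch1", "SMO": "Smo"}
human_pathway = ["SHH", "PTCH1", "SMO", "GLI1"]

mouse_pathway, missing = translate_pathway(human_pathway, human_to_mouse)
print(mouse_pathway)  # genes carried over to the target species
print(missing)        # genes that need manual curation
```

Reporting the unmapped genes explicitly matters in practice, because a translated pathway with silently dropped members can look more complete than it is.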
Given the genetic makeup of an organism, the complete set of possible reactions constitutes its reactome. Reactome, located at http://www.reactome.org, is a curated, peer-reviewed resource of human biological process and pathway data. The basic unit of the Reactome database is a reaction; reactions are then grouped into causal chains to form pathways. The Reactome data model can represent many diverse processes in the human system, including the pathways of intermediary metabolism, regulatory pathways, and signal transduction, as well as high-level processes such as the cell cycle. Reactome provides a qualitative framework on which quantitative data can be superimposed. Tools have been developed to facilitate custom data entry and annotation by expert biologists, and to allow visualization and exploration of the finished dataset as an interactive process map. Although the primary curational domain is pathways from Homo sapiens, electronic projections of human pathways onto other organisms are regularly created via putative orthologs, making Reactome relevant to model organism research communities. The database is publicly available under open source terms, which allows both its content and its software infrastructure to be freely used and redistributed.
Studying whole transcriptional profiles and cataloging protein–protein interactions has yielded much valuable biological information, from the genome or proteome to the physiology of an organism, an organ, a tissue or even a single cell. The Reactome database contains a framework of possible reactions which, when combined with expression and enzyme kinetic data, provides the infrastructure for quantitative models. It therefore offers an integrated view of biological processes that links such gene products and can be systematically mined using bioinformatics applications.
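The notion of grouping reactions into causal chains can be sketched by ordering reactions so that each one follows the reactions that produce its inputs. This is an illustration of the idea only, not Reactome's actual data model; the function `order_reactions`, the reaction names `r1` to `r3` and the molecule labels are assumptions, and the standard-library `graphlib` module requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def order_reactions(reactions):
    """Order reactions so that producers come before consumers.

    `reactions` maps a reaction name to a pair (inputs, outputs), each a
    set of molecule names. Raises graphlib.CycleError if the reactions
    feed back into each other.
    """
    # Index which reactions produce each molecule.
    producers = {}
    for name, (_, outputs) in reactions.items():
        for molecule in outputs:
            producers.setdefault(molecule, set()).add(name)

    sorter = TopologicalSorter()
    for name, (inputs, _) in reactions.items():
        upstream = set()
        for molecule in inputs:
            upstream |= producers.get(molecule, set())
        upstream.discard(name)  # ignore a reaction consuming its own product
        sorter.add(name, *upstream)
    return list(sorter.static_order())


# Hypothetical three-step chain, listed out of order on purpose.
toy_reactions = {
    "r3": ({"C"}, {"D"}),
    "r1": ({"A"}, {"B"}),
    "r2": ({"B"}, {"C"}),
}
print(order_reactions(toy_reactions))
```

For this toy chain the only valid order is `r1`, `r2`, `r3`; with branching reactions several orders can be valid and the sorter returns one of them.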
Reactome data are available in a variety of standard formats, including BioPAX, SBML and PSI-MI, which also enable data exchange with other pathway databases, such as the Cycs, KEGG and aMAZE, and with molecular interaction databases, such as BIND and HPRD. The next data release will cover apoptosis, including the death receptor signaling pathways and the Bcl2 pathways, as well as pathways involved in hemostasis. Other topics currently under development include several signaling pathways, mitosis, visual phototransduction and hematopoiesis. In summary, Reactome is an open-source project that provides high-quality curated summaries of fundamental biological processes in humans in the form of biologist-friendly visualizations of pathway data.
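To give a feel for what working with one of these exchange formats involves, the sketch below reads reactions from a tiny SBML-like fragment using only the Python standard library. The fragment is hand-written for illustration, is not a Reactome export and omits elements a complete SBML model would need (compartments, species definitions); the namespace is assumed to be that of SBML Level 3 Version 1 Core. For real files a dedicated SBML library is the safer choice.

```python
import xml.etree.ElementTree as ET

# Assumption: the namespace used by SBML Level 3 Version 1 Core documents.
SBML_NS = {"sbml": "http://www.sbml.org/sbml/level3/version1/core"}

# Hand-written, incomplete fragment used only to exercise the parser.
SAMPLE_SBML = """
<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="toy_cascade">
    <listOfReactions>
      <reaction id="r1" reversible="false">
        <listOfReactants><speciesReference species="A" constant="true"/></listOfReactants>
        <listOfProducts><speciesReference species="B" constant="true"/></listOfProducts>
      </reaction>
      <reaction id="r2" reversible="false">
        <listOfReactants><speciesReference species="B" constant="true"/></listOfReactants>
        <listOfProducts><speciesReference species="C" constant="true"/></listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
""".strip()


def list_reactions(sbml_text):
    """Return {reaction id: (reactant species, product species)}."""
    root = ET.fromstring(sbml_text)
    reactions = {}
    for reaction in root.findall(".//sbml:reaction", SBML_NS):
        reactants = [
            ref.get("species")
            for ref in reaction.findall("sbml:listOfReactants/sbml:speciesReference", SBML_NS)
        ]
        products = [
            ref.get("species")
            for ref in reaction.findall("sbml:listOfProducts/sbml:speciesReference", SBML_NS)
        ]
        reactions[reaction.get("id")] = (reactants, products)
    return reactions


print(list_reactions(SAMPLE_SBML))
```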
Pathway-Oriented Approaches
In the post-genomic age, high-throughput sequencing and gene/protein profiling techniques have transformed biological research by enabling comprehensive monitoring of a biological system. Such experiments yield a list of differentially expressed genes or proteins, which is useful for identifying genes that may have roles in a given phenomenon or phenotype. With DNA microarrays and genome-wide gene engineering, it is possible to screen global gene expression profiles and contribute a wealth of genomic data to the public domain. With RNA interference, it is possible to distill the inferences contained in the experimental literature and primary databases into knowledge bases that consist of annotated representations of biological pathways. In such knowledge bases, individual genes and proteins are annotated with the biological processes, components, or structures they are involved in, as well as how and where gene products interact with each other.
Pathway-oriented approaches to analyzing microarray data group long lists of individual genes, proteins, and/or other biological molecules into smaller sets of related genes or proteins according to the pathways they are involved in. This reduces complexity and has proven useful for connecting genomic data to specific biological processes and systems. Identifying active pathways that differ between two conditions can have more explanatory power than a simple list of different genes or proteins. In addition, a large number of pathway analysis methods exploit pathway knowledge in public repositories such as Gene Ontology (GO) or Kyoto Encyclopedia of Genes and Genomes (KEGG), rather than inferring pathways from molecular measurements.
Furthermore, different research focuses have given the word “pathway” different meanings. For example, “pathway” can denote a metabolic pathway involving a sequence of enzyme-catalyzed reactions of small molecules, or a signaling pathway involving a set of protein phosphorylation reactions and gene regulation events.
Therefore, the term “pathway analysis” has a very broad application. For instance, it can refer to the analysis of physical interaction networks (e.g., protein–protein interactions), kinetic simulation of pathways, and steady-state pathway analysis (e.g., flux-balance analysis), as well as to the inference of pathways from expression and sequence data. Several functional enrichment analysis tools and algorithms have been developed to enhance data interpretation. The existing knowledge base–driven pathway analysis methods in each generation have been summarized in recent literature.
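One common form of functional enrichment analysis is over-representation analysis, which asks whether a pathway contains more genes from an experimental hit list than expected by chance. The sketch below computes the one-sided hypergeometric p-value with the standard library; it is a generic illustration, not the method of any particular tool named in this article, and the function names and toy numbers are assumptions.

```python
from math import comb


def enrichment_p_value(population, pathway_size, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `population` genes, of which `pathway_size` belong to the pathway."""
    upper = min(pathway_size, selected)
    total = comb(population, selected)
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def pathway_enrichment(background, pathway_genes, hit_list):
    """Convenience wrapper working on gene collections instead of counts."""
    background = set(background)
    pathway = set(pathway_genes) & background
    hits = set(hit_list) & background
    return enrichment_p_value(
        len(background), len(pathway), len(hits), len(pathway & hits)
    )


# Toy numbers: 20 measured genes, a 5-gene pathway, 4 differentially
# expressed genes, 3 of which fall in the pathway.
p = enrichment_p_value(population=20, pathway_size=5, selected=4, overlap=3)
print(round(p, 4))
```

When many pathways are tested at once, the resulting p-values need a multiple-testing correction before they are interpreted.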
Applications of Pathway Analysis in Disease
Colorectal cancer (CRC)
The program package MatchMiner was used to scan HUGO names for cloned genes of interest, which were then input into GoMiner (online at http://genomebiology.com/2003/4/4/R28), which leveraged the GO to identify the biological processes, functions and components represented in the gene profile. In addition, the Database for Annotation, Visualization, and Integrated Discovery (DAVID) (http://genomebiology.com/2003/4/9/R60) and the KEGG database (http://www.genome.ad.jp/kegg/) can be used for the analysis of microarray expression data and of each GO biological process (P), cellular component (C), and molecular function (F) ontology. DAVID tools can also be used to analyze the roles of genes in metabolic pathways and to show the biological relationships between genes or gene products. These two databases also provide online bioinformatics tools that combine specific biochemical information on a given organism and facilitate the biological interpretation of experimental data. Using a combined microarray-bioinformatics approach, a potential metabolic mechanism contributing to colorectal cancer (CRC) has been demonstrated. Several environmental factors may act at a series of points along the genetic pathway to CRC. The genes involved include those associated with the bile acid metabolism, glycolysis and fatty acid metabolism pathways, supporting the hypothesis that some metabolic alterations observed in colon carcinoma may occur during the development of CRC.
Parkinson’s disease (PD)
Cellular models are instrumental in dissecting a complex pathological process into simpler molecular events. Parkinson’s disease (PD) is multifactorial and clinically heterogeneous; the aetiology of the sporadic (and most common) form is still unclear, and only a few molecular mechanisms in the neurodegenerative cascade have been clarified so far. In such a multifaceted picture, it is particularly important to identify experimental models that simplify the study of the different networks of proteins and genes involved. Cellular models that reproduce some of the features of the neurons that degenerate in PD have contributed to many advances in our understanding of the pathogenic flow of the disease. In particular, the pivotal biochemical pathways (i.e. apoptosis and oxidative stress, mitochondrial impairment and dysfunctional mitophagy, unfolded protein stress and improper removal of misfolded proteins) have been widely explored in cell lines, either challenged with toxic insults or genetically modified. The central role of α-synuclein has generated many models aiming to elucidate its contribution to the dysregulation of various cellular processes. Classical cellular models appear to be the correct choice for preliminary studies on the molecular action of new drugs or potential toxins and for understanding the role of single genetic factors. Moreover, the availability of novel cellular systems, such as cybrids or induced pluripotent stem cells, offers the chance to exploit the advantages of an in vitro investigation while mirroring more closely the cell population being affected.
Alzheimer's disease (AD)
Synaptic degeneration and death of nerve cells are defining features of Alzheimer’s disease (AD), the most prevalent age-related neurodegenerative disorder. In AD, neurons in the hippocampus and basal forebrain (brain regions that subserve learning and memory functions) are selectively vulnerable. Studies of postmortem brain tissue from people with AD have provided evidence for increased levels of oxidative stress, mitochondrial dysfunction and impaired glucose uptake in vulnerable neuronal populations. Studies of animal and cell culture models of AD suggest that increased levels of oxidative stress (membrane lipid peroxidation, in particular) may disrupt neuronal energy metabolism and ion homeostasis by impairing the function of membrane ion-motive ATPases and of glucose and glutamate transporters. Such oxidative and metabolic compromise may thereby render neurons vulnerable to excitotoxicity and apoptosis. Recent studies suggest that AD can manifest systemic alterations in energy metabolism (e.g., increased insulin resistance and dysregulation of glucose metabolism). Emerging evidence that dietary restriction can forestall the development of AD is consistent with a major “metabolic” component to the disease, and provides optimism that such devastating brain disorders of aging may be largely preventable.