Pan-cancer analysis
Pan-cancer analysis aims to examine the similarities and differences among the genomic and cellular alterations found across diverse tumor types.[1][2] International efforts have performed pan-cancer analyses on both the exomes and the whole genomes of cancers, the latter including their non-coding regions. In 2018, The Cancer Genome Atlas (TCGA) Research Network used exome, transcriptome, and DNA methylome data to develop an integrated picture of commonalities, differences, and emergent themes across tumor types.
In 2020, the International Cancer Genome Consortium (ICGC)/TCGA Pan-Cancer Analysis of Whole Genomes project published a set of 24 papers analyzing whole cancer genomes and transcriptomic data from 38 tumor types. A comprehensive overview of the project is provided in its flagship paper.[3]
Another project, pan-cancer analysis of RNA-binding proteins (RBPs) across human cancers,[4] explored the expression, somatic copy number alteration, and mutation profiles of 1,542 RBPs in ~7,000 clinical specimens across 15 cancer types. This study characterized the oncogenic properties of six RBPs—NSUN6, ZC3H13, BYSL, ELAC1, RBMS3, and ZGPAT—in colorectal and liver cancer cell lines.
Several studies have found a causal, predictable connection between genomic alterations (single-nucleotide variants or large copy number variants) and gene expression across all tumor types. This pan-cancer relationship between genomic status and quantitative transcriptomic data makes it possible to predict a specific genomic alteration from gene expression profiles alone;[5] it can also serve as the basis for machine learning approaches.
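The idea of inferring an alteration from expression alone can be illustrated with a minimal sketch. It is not the method of the cited study: it uses a simple nearest-centroid classifier on synthetic data, and every name and parameter in it (the number of genes, the size of the expression shift, the function names) is an assumption made for illustration.

```python
import random


def simulate_samples(n, altered, rng, n_genes=20, shift=3.0):
    """Simulate expression profiles (hypothetical data, not real tumors).

    When the alteration is present, the first five genes are shifted upward,
    mimicking an expression signature associated with that alteration.
    """
    samples = []
    for _ in range(n):
        profile = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        if altered:
            for g in range(5):
                profile[g] += shift
        samples.append(profile)
    return samples


def centroid(profiles):
    """Mean expression of each gene across a group of samples."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict_altered(profile, centroid_wt, centroid_alt):
    """Call a sample altered if it lies closer to the altered centroid."""
    return squared_distance(profile, centroid_alt) < squared_distance(profile, centroid_wt)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Training data: samples whose alteration status is known.
c_wt = centroid(simulate_samples(50, False, rng))
c_alt = centroid(simulate_samples(50, True, rng))

# Held-out data: predict alteration status from expression alone.
test_set = [(p, False) for p in simulate_samples(50, False, rng)]
test_set += [(p, True) for p in simulate_samples(50, True, rng)]

correct = sum(predict_altered(p, c_wt, c_alt) == label for p, label in test_set)
accuracy = correct / len(test_set)
print(f"held-out accuracy: {accuracy:.2f}")
```

On real data the signal is far weaker and more gene-specific than in this simulation, so a practical model would need proper cross-validation and a more expressive classifier.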
Pan-cancer studies
Pan-cancer studies aim to detect the genes whose mutation is conducive to oncogenesis, as well as genomic events or aberrations that recur across different tumors. These studies require data from multiple platforms to be standardized, with common criteria agreed among researchers for processing the data and presenting the results. Omics data allow the rapid identification and quantification of thousands of molecules in a single experiment. Genomics indicates which genes have the potential to be expressed, proteomics indicates which genes are in fact being expressed, and metabolomics reflects what has happened in the tissue being studied. Combined, these data layers give an integrated view of the biological system.
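One simple way to put measurements from different platforms on a common scale is to standardize each gene's values within each platform. The sketch below shows this with per-platform z-scores. The platform names and values are hypothetical, and this is only an illustration of the general idea, not the harmonization pipeline used by any particular consortium.

```python
from statistics import mean, pstdev


def zscore_within_platform(values_by_platform):
    """Standardize one gene's measurements separately for each platform.

    values_by_platform maps a platform name to that gene's raw values
    across the samples measured on that platform.
    """
    standardized = {}
    for platform, values in values_by_platform.items():
        mu = mean(values)
        sigma = pstdev(values)
        if sigma == 0:
            # No variation on this platform: every sample sits at the mean.
            standardized[platform] = [0.0 for _ in values]
        else:
            standardized[platform] = [(v - mu) / sigma for v in values]
    return standardized


# Hypothetical raw values for one gene on two platforms with different scales.
gene_expression = {
    "platform_A": [2.0, 4.0, 6.0],
    "platform_B": [200.0, 400.0, 600.0],
}

standardized = zscore_within_platform(gene_expression)
for platform, values in standardized.items():
    print(platform, [round(v, 3) for v in values])
```

After standardization the two platforms yield the same relative profile even though their raw scales differ by two orders of magnitude. Real projects typically go further, for example by correcting batch effects, which this sketch does not attempt.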
Resources and databases
The nearly 800 terabytes of data from the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes project have been made available through various portals and repositories, including those at the Ontario Institute for Cancer Research, the European Molecular Biology Laboratory's European Bioinformatics Institute, and the National Center for Biotechnology Information. All data obtained from the TCGA efforts are available at the US National Cancer Institute's TARGET Data Matrix and the web portal ProteinPaint.[6]
The StarBase pan-cancer resources[7] were created to explore the interaction networks of long noncoding RNAs, microRNAs, competing endogenous RNAs, and RBPs.
External links
- Nature journals' landing page for the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes project publications
- European Molecular Biology Laboratory's landing page for the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes project
- StarBase ENCORI Pan-Cancer Analysis Platform
- US National Cancer Institute's TARGET Data Matrix
- ProteinPaint portal
References
- Cancer Genome Atlas Research Network; Weinstein, JN; Collisson, EA; Mills, GB; Shaw, KR; Ozenberger, BA; Ellrott, K; Shmulevich, I; Sander, C; Stuart, JM (Oct 2013). "The Cancer Genome Atlas Pan-Cancer analysis project". Nature Genetics. 45 (10): 1113–20. doi:10.1038/ng.2764. PMC 3919969. PMID 24071849.
- Omberg, L; Ellrott, K; Yuan, Y; Kandoth, C; Wong, C; Kellen, MR; Friend, SH; Stuart, J; Liang, H; Margolin, AA (Oct 2013). "Enabling transparent and collaborative computational analysis of 12 tumor types within The Cancer Genome Atlas". Nature Genetics. 45 (10): 1121–6. doi:10.1038/ng.2761. PMC 3950337. PMID 24071850.
- The ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium (5 February 2020). "Pan-cancer analysis of Whole Genomes". Nature. 578 (7793): 82–93. Bibcode:2020Natur.578...82I. doi:10.1038/s41586-020-1969-6. PMC 7025898. PMID 32025007.
- Wang, ZL; Li, B; Luo, YX; Lin, Q; Liu, SR; Zhang, XQ; Zhou, H; Yang, JH; Qu, LH (2 January 2018). "Comprehensive Genomic Characterization of RNA-Binding Proteins across Human Cancers". Cell Reports. 22 (1): 286–298. doi:10.1016/j.celrep.2017.12.035. PMID 29298429.
- Mercatelli, Daniele; Ray, Forest; Giorgi, Federico M. (2019). "Pan-Cancer and Single-Cell Modeling of Genomic Alterations Through Gene Expression". Frontiers in Genetics. 10: 671. doi:10.3389/fgene.2019.00671. ISSN 1664-8021. PMC 6657420. PMID 31379928.
- "Exploring genomic alteration in pediatric cancer using ProteinPaint". Nature Genetics.
- Li, JH; Liu, S; Zhou, H; Qu, LH; Yang, JH (January 2014). "starBase v2.0: decoding miRNA-ceRNA, miRNA-ncRNA and protein-RNA interaction networks from large-scale CLIP-Seq data". Nucleic Acids Research. 42 (Database issue): D92-7. doi:10.1093/nar/gkt1248. PMC 3964941. PMID 24297251.