Copy-number variations (CNVs), a form of structural variation, are alterations of the DNA of a genome that result in the cell having an abnormal or, for certain genes, a normal variation in the number of copies of one or more sections of the DNA. CNVs correspond to relatively large regions of the genome that have been deleted (giving fewer copies than normal) or duplicated (giving more copies than normal) on certain chromosomes. For example, a chromosome whose sections are normally ordered
A-B-C-D might instead have the sections
A-B-C-C-D (a duplication of "C") or
A-B-D (a deletion of "C").
This variation accounts for roughly 13% of human genomic DNA, and each variant may range from about one kilobase (1,000 nucleotide bases) to several megabases in size. CNVs contrast with single-nucleotide polymorphisms (SNPs), which affect only a single nucleotide base.
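The A-B-C-D example above can be made concrete with a toy sketch. It treats a chromosome as a hyphen-separated string of labelled sections and compares how often each section occurs in a reference and in a sample. The function name `copy_number_changes` and the string encoding are assumptions made for illustration only; real CNV detection works on sequence data, not on labelled segments.

```python
from collections import Counter


def copy_number_changes(reference, sample):
    """Report sections whose copy count differs between reference and sample.

    Both arguments are hyphen-separated section labels, e.g. "A-B-C-D"
    (an illustrative encoding, not a real genomic format).
    """
    ref_counts = Counter(reference.split("-"))
    sample_counts = Counter(sample.split("-"))

    changes = {}
    for section, expected in ref_counts.items():
        observed = sample_counts.get(section, 0)
        if observed > expected:
            changes[section] = "duplication"
        elif observed < expected:
            changes[section] = "deletion"
    return changes


print(copy_number_changes("A-B-C-D", "A-B-C-C-D"))  # {'C': 'duplication'}
print(copy_number_changes("A-B-C-D", "A-B-D"))      # {'C': 'deletion'}
```

The sketch only counts copies; it ignores where in the chromosome the extra or missing copy sits.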
Most CNVs are stable and heritable, so CNV between individuals is largely a product of genetic heritage. However, de novo CNVs arise through diverse mechanisms at various stages of development. Multiple homologous recombination reactions on each chromosome are required for the meiotic cell divisions that give rise to gametes, and although these events are of very high fidelity, occasional mistakes are inevitable. Most CNV in the human genome therefore likely arises through non-allelic homologous recombination events, in which unmatched regions of chromosomes are mistakenly recombined during meiosis. However, two lines of evidence suggest that this is not the whole story. First, various studies have revealed extensive CNV between different cells in the same individual; these CNVs must have arisen after fertilisation. Second, some complex genetic rearrangements cannot be readily reconciled with a non-allelic homologous recombination mechanism; these have been proposed to arise through rare replication defects in which broken DNA at one replication fork invades another fork, resulting in a template switch. This proposal was subsequently superseded by a more general microhomology-mediated break-induced replication (MMBIR) model.
CNVs can be caused by structural rearrangements of the genome such as deletions, duplications, inversions, and translocations. Low copy repeats (LCRs), which are region-specific repeat sequences, are susceptible to such genomic rearrangements, which result in CNVs. The size, orientation and percentage sequence similarity of the copies, and the distance between them, influence how susceptible LCRs are to rearrangement. Segmental duplications (SDs) tend to map near ancestral duplication sites, a phenomenon called duplication shadowing: regions flanking existing duplications are roughly 10-fold more likely to be duplicated than other random regions.
CNV in short repeated DNA sequences called microsatellites can arise through additional mechanisms including replication slippage and defective mismatch repair. The resulting microsatellite instability is characteristic of some cancers and underlies a family of genetic disorders including Huntington's disease and myotonic dystrophy.
Copy number variation can be discovered by cytogenetic techniques such as fluorescence in situ hybridization, comparative genomic hybridization, and array comparative genomic hybridization, and by virtual karyotyping with SNP arrays. Advances in DNA sequencing technology have further enabled the identification of CNVs by next-generation sequencing.
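One common way sequencing data are used for this is read-depth analysis: the number of reads falling in each genomic window is compared between a sample and a reference, and windows with a markedly higher or lower ratio are flagged. The following is a minimal sketch of that idea only, not the method of any particular tool. The function name `call_windows`, the median normalisation, and the thresholds `gain=0.4` and `loss=-0.6` are illustrative assumptions; the sketch also assumes every reference window has non-zero depth.

```python
import math
from statistics import median


def call_windows(sample_depth, reference_depth, gain=0.4, loss=-0.6):
    """Label each window as 'gain', 'loss' or 'neutral' from read depth.

    Depths are scaled by their median so that differences in overall
    sequencing coverage cancel out. Thresholds are on the log2 ratio
    and are hypothetical values chosen for illustration.
    """
    sample_median = median(sample_depth)
    reference_median = median(reference_depth)

    calls = []
    for s, r in zip(sample_depth, reference_depth):
        ratio = (s / sample_median) / (r / reference_median)
        # A window with zero sample reads has no finite log2 ratio.
        log2_ratio = math.log2(ratio) if ratio > 0 else float("-inf")
        if log2_ratio >= gain:
            calls.append("gain")
        elif log2_ratio <= loss:
            calls.append("loss")
        else:
            calls.append("neutral")
    return calls


# Made-up depths: windows 3-4 look duplicated, window 6 looks deleted.
sample = [100, 98, 150, 152, 101, 50, 99]
reference = [100, 100, 100, 100, 100, 100, 100]
print(call_windows(sample, reference))
```

For a diploid region, a single-copy gain corresponds to a log2 ratio of about 0.58 (3 copies versus 2) and a single-copy loss to -1 (1 copy versus 2), which is why the illustrative thresholds sit between those values and zero.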
CNVs can be limited to a single gene or include a contiguous set of genes. They can leave a cell with too many or too few copies of dosage-sensitive genes, which may be responsible for a substantial amount of human phenotypic variability, complex behavioral traits and disease susceptibility.
In certain cases, such as rapidly growing Escherichia coli cells, the copy number of genes located near the origin of DNA replication can be 4-fold greater than that of genes located near the terminus. Elevating the copy number of a particular gene can increase the expression of the protein that it encodes.
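The 4-fold figure for E. coli can be reproduced with a short worked calculation under the standard Cooper-Helmstetter model of overlapping replication rounds. The model and the parameter values below (a replication time C of about 40 minutes and a doubling time of 20 minutes) are assumptions brought in for illustration; they are not stated in the text above.

```latex
% Assumed symbols: C = time to replicate the chromosome,
% D = time between the end of replication and cell division,
% \tau = culture doubling time,
% m = fractional position of a gene between origin (m = 0) and terminus (m = 1).
\[
  \bar{N}(m) = 2^{\,[C(1-m) + D]/\tau},
  \qquad
  \frac{\bar{N}(0)}{\bar{N}(1)} = 2^{\,C/\tau}.
\]
% Illustrative values: C = 40 min, \tau = 20 min.
\[
  \frac{\bar{N}(0)}{\bar{N}(1)} = 2^{40/20} = 4 .
\]
```

Under these assumptions the ratio depends only on how many doubling times fit into one round of replication, so slower-growing cultures show a smaller origin-to-terminus difference.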
Prevalence in humans
That DNA copy number variation is widespread and common among humans was first uncovered following the completion of the Human Genome Project. It is estimated that approximately 0.4% of the genome typically differs between unrelated people with respect to copy number. De novo CNVs have been observed between identical twins, who otherwise have identical genomes.
Role in disease
Like other types of genetic variation, some CNVs have been associated with susceptibility or resistance to disease. Gene copy number can be elevated in cancer cells. For instance, the EGFR copy number can be higher than normal in non-small cell lung cancer. In addition, a higher copy number of CCL3L1 has been associated with lower susceptibility to HIV infection, and a low copy number of FCGR3B (the CD16 cell surface immunoglobulin receptor) can increase susceptibility to systemic lupus erythematosus and similar inflammatory autoimmune disorders. Copy number variation has also been associated with autism, schizophrenia, and idiopathic learning disability.
Among common functional CNVs, gene gains outnumber losses, suggesting that many of them are favored in evolution and are therefore beneficial in some way. One example is the human salivary amylase gene (AMY1). This gene is typically present as two diploid copies in chimpanzees, whereas humans average over 6 copies and may have as many as 15. This is thought to be an adaptation to a high-starch diet that improves the ability to digest starchy foods.
See also
- Comparative genomics
- Copy number analysis
- Human genome
- Molecular evolution
- Segmental duplication
- Tandem exon duplication
- Virtual Karyotype