In adult somatic tissues, cytosine residues may be methylated, and this occurs almost exclusively within a symmetric CpG context. Methylated C residues spontaneously deaminate to form T residues; hence CpG dinucleotides steadily mutate to TpG dinucleotides, which accounts for the under-representation of CpG dinucleotides in the human genome (they occur at only 21% of the expected frequency). By contrast, spontaneous deamination of unmethylated C residues gives rise to U residues, a change that is quickly recognized and repaired by the cell.
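The "expected frequency" above can be made concrete with a small calculation. The following is a minimal sketch, assuming the expectation is derived from base composition alone (number of C times number of G, divided by sequence length), which is one common convention and not necessarily the one behind the 21% figure. The function name `cpg_obs_exp` is hypothetical.

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio for a single DNA strand.

    Assumption: the expected CpG count is (#C * #G) / N, i.e. C and G are
    treated as occurring independently along a sequence of length N.
    """
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    if c == 0 or g == 0:
        # No CpG is possible, so report 0.0 instead of dividing by zero.
        return 0.0
    # "CG" cannot overlap with itself, so str.count finds every occurrence.
    observed = seq.count("CG")
    return observed * n / (c * g)
```

A ratio near 1 would mean CpG occurs about as often as base composition predicts; a ratio well below 1 indicates depletion of the kind described above. For example, `cpg_obs_exp("CCGG")` evaluates to 1.0 under this definition.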
In humans and mice, CG is the least frequent dinucleotide, making up less than 1% of all dinucleotides. GC is the second least frequent, making up more than 4% of all dinucleotides, so CG is more than fourfold less frequent than any other dinucleotide.
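Dinucleotide percentages of this kind can be tallied directly from a sequence. The sketch below is illustrative only: it assumes a single strand, counts overlapping two-base windows, and skips any window containing a character other than A, C, G, or T. The function name `dinucleotide_frequencies` is hypothetical.

```python
from collections import Counter


def dinucleotide_frequencies(seq):
    """Return the fraction of each dinucleotide among all valid windows."""
    seq = seq.upper()
    # Slide a two-base window along the sequence, one position at a time.
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    # Drop windows with ambiguous bases such as N (an assumption of this sketch).
    pairs = [p for p in pairs if set(p) <= set("ACGT")]
    total = len(pairs)
    if total == 0:
        return {}
    counts = Counter(pairs)
    return {pair: count / total for pair, count in counts.items()}
```

Applied to a whole chromosome, the value stored under `"CG"` could be compared with the value under `"GC"` to check the roughly fourfold difference described above. Toy inputs such as `"ACGT"` only confirm that the counting works.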
- Law J, Jacobsen SE (2010). "Establishing, maintaining and modifying DNA methylation patterns in plants and animals". Nat. Rev. Genet. 11 (3): 204–220. doi:10.1038/nrg2719. PMID 20142834.
- International Human Genome Sequencing Consortium; et al. (February 2001). "Initial sequencing and analysis of the human genome". Nature. 409 (6822): 860–921. doi:10.1038/35057062. PMID 11237011.