The term C-value refers to the amount of DNA, in picograms, contained within a haploid nucleus (e.g. a gamete), or equivalently one half the amount in a diploid somatic cell of a eukaryotic organism. In some cases (notably among diploid organisms), the terms C-value and genome size are used interchangeably; in polyploids, however, the C-value may represent two or more genomes contained within the same nucleus. Greilhuber et al. (2005) have suggested some new layers of terminology and associated abbreviations to clarify this issue, but these somewhat complex additions have yet to be used by other authors.
Origin of the term
Many authors have incorrectly assumed that the "C" in "C-value" refers to "characteristic", "content", or "complement". Even among authors who have attempted to trace the origin of the term, there has been some confusion because Hewson Swift did not define it explicitly when he coined it in 1950. In his original paper, Swift appeared to use the designations "1C value", "2C value", etc., in reference to "classes" of DNA content (e.g., Gregory 2001, 2002); however, Swift explained in personal correspondence to Prof. Michael D. Bennett in 1975 that "I am afraid the letter C stood for nothing more glamorous than 'constant', i.e., the amount of DNA that was characteristic of a particular genotype" (quoted in Bennett and Leitch 2005). This is in reference to the report in 1948 by Vendrely and Vendrely of a "remarkable constancy in the nuclear DNA content of all the cells in all the individuals within a given animal species" (translated from the original French). Swift's study of this topic related specifically to variation (or lack thereof) among chromosome sets in different cell types within individuals, but his notation evolved into "C-value" in reference to the haploid DNA content of individual species and retains this usage today.
Variation among species
C-values vary enormously among species. In animals they range more than 3,300-fold, and in land plants they differ by a factor of about 1,000. Protist genomes have been reported to vary more than 300,000-fold in size, but the high end of this range (Amoeba) has been called into question. Variation in C-values bears no relationship to the complexity of the organism or the number of genes contained in its genome, an observation that was deemed wholly counterintuitive before the discovery of non-coding DNA and that became known as the C-value paradox as a result. Although there is no longer any paradoxical aspect to the discrepancy between C-value and gene number, the term remains in common usage. For conceptual clarity, it has been suggested that the various puzzles that remain with regard to genome size variation are more accurately described as a complex but clearly defined puzzle known as the C-value enigma. C-values correlate with a range of features at the cell and organism levels, including cell size, cell division rate, and, depending on the taxon, body size, metabolic rate, developmental rate, organ complexity, geographical distribution, or extinction risk (for recent reviews, see Bennett and Leitch 2005; Gregory 2005).
|Nucleotide||Chemical formula||Relative molecular mass (Da)|
†Source of table: Doležel et al., 2003
By using the data in Table 1, relative masses of nucleotide pairs can be calculated as follows: A/T = 615.383 and G/C = 616.3711, bearing in mind that formation of one phosphodiester linkage involves a loss of one H2O molecule. Further, phosphates of nucleotides in the DNA chain are acidic so at physiologic pH the H+ ion is dissociated. Provided the ratio of A/T to G/C pairs is 1:1 (the GC-content is 50%), the mean relative mass of one nucleotide pair is 615.8771.
The relative molecular mass may be converted to an absolute value by multiplying it by the atomic mass unit (1 u), 1.660539 × 10^-27 kg. Since 1 kg equals 10^15 pg, the mean mass per nucleotide pair would be about 1.023 × 10^-9 pg/bp, and the corresponding average DNA density would be about 978 Mb/pg.
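The arithmetic above can be checked with a short Python sketch. It uses only the figures quoted in the text (the two pair masses and the atomic mass unit); the variable names are illustrative and do not come from Doležel et al.

```python
# Relative molecular masses of the nucleotide pairs, as quoted in the text (Da).
AT_PAIR_DA = 615.383
GC_PAIR_DA = 616.3711

# Atomic mass unit in kilograms, as quoted in the text.
AMU_KG = 1.660539e-27

# Unit conversion: 1 kg = 1000 g and 1 g = 1e12 pg, so 1 kg = 1e15 pg.
PG_PER_KG = 1e15

# Mean relative mass of one nucleotide pair at 50% GC content.
mean_pair_da = (AT_PAIR_DA + GC_PAIR_DA) / 2

# Absolute mass of one nucleotide pair in picograms.
pg_per_bp = mean_pair_da * AMU_KG * PG_PER_KG

# Number of megabase pairs contained in one picogram of DNA.
mb_per_pg = 1.0 / pg_per_bp / 1e6

print(f"mean pair mass: {mean_pair_da:.4f} Da")
print(f"mass per bp:    {pg_per_bp:.4e} pg")
print(f"density:        {mb_per_pg:.1f} Mb/pg")
```

The intermediate values are kept as separate names so that each step of the conversion (relative mass, absolute mass, density) can be compared with the corresponding number in the text.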
No species has a GC-content of exactly 50% (equal amounts of A/T and G/C nucleotide pairs) as assumed by Doležel et al. However, as a G/C pair is heavier than an A/T pair by only about 1/6 of 1%, the effect of variations in GC content is small. The actual GC content varies between species, between chromosomes, and between isochores (sections of a chromosome with like GC content). Adjusting Doležel's calculation for GC content, the theoretical number of base pairs per picogram ranges from 977.0317 Mb/pg at 100% GC content to 978.6005 Mb/pg at 0% GC content (A/T pairs are lighter, so more of them fit in a picogram), with a value of 977.8155 Mb/pg at 50% GC content.
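The GC adjustment can be sketched as a single function of the GC fraction. This is a minimal illustration that assumes the mean pair mass is a simple weighted average of the two pair masses quoted earlier; the function name `density_mb_per_pg` is hypothetical.

```python
AT_PAIR_DA = 615.383      # relative mass of an A/T pair (from the text)
GC_PAIR_DA = 616.3711     # relative mass of a G/C pair (from the text)
AMU_KG = 1.660539e-27     # atomic mass unit in kg (from the text)
PG_PER_KG = 1e15          # 1 kg = 1e15 pg


def density_mb_per_pg(gc_fraction):
    """Return megabase pairs per picogram for a GC fraction between 0 and 1.

    Assumes the mean pair mass is the GC-weighted average of the two pair masses.
    """
    if not 0.0 <= gc_fraction <= 1.0:
        raise ValueError("gc_fraction must be between 0 and 1")
    mean_da = gc_fraction * GC_PAIR_DA + (1.0 - gc_fraction) * AT_PAIR_DA
    pg_per_bp = mean_da * AMU_KG * PG_PER_KG
    return 1.0 / pg_per_bp / 1e6


# Evaluate the two extremes and the 50% case discussed in the text.
for gc in (0.0, 0.5, 1.0):
    print(f"GC {gc:4.0%}: {density_mb_per_pg(gc):.4f} Mb/pg")
```

Because density is the reciprocal of mass, it is not exactly linear in GC content, which is why the 50% value differs slightly from the arithmetic mean of the two extremes.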
The human genome varies in size; however, the current estimate of the nuclear haploid size of the reference human genome is 3,031,042,417 bp for the X gamete and 2,932,228,937 bp for the Y gamete. (The X gamete is larger because the X chromosome is larger than the Y chromosome. Consequently, the XX female diploid genome is larger than the XY male diploid genome.)
Summarizing these numbers:
|Cell||Chromosome description||Type||Ploidy||Base pairs (bp)||GC content (%)||Density (Mb/pg)||Mass (pg)||C-value (pg)|
|Sperm or egg||23 heterologous chromosomes||X Gamete||Haploid||3,031,042,417||40.97460%||977.9571||3.099361||3.099361|
|Sperm only||23 heterologous chromosomes||Y Gamete||Haploid||2,932,228,937||41.01724%||977.9564||2.998323||2.998323|
|Zygote||46 chromosomes consisting of 2 homologous groups of 23 heterologous chromosomes each||XX Female||Diploid||6,062,084,834||40.97460%||977.9571||6.198723||3.099361|
|Zygote||46 chromosomes consisting of 2 homologous groups of 22 heterologous chromosomes each plus 2 heterologous chromosomes||XY Male||Mostly diploid||5,963,271,354||40.99557%||977.9567||6.097684||3.157877|
(The male C-value shown takes into account the 24 heterologous chromosomes.)
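The gamete and zygote masses in the table can be approximately reproduced from the base-pair counts and GC contents. The sketch below assumes the same weighted-average pair mass used for the density figures, and it treats the zygote mass as the sum of the two gamete masses; the function and variable names are illustrative. It does not attempt to reproduce the male C-value, which follows the separate 24-chromosome convention noted above.

```python
AT_PAIR_DA = 615.383      # relative mass of an A/T pair (from the text)
GC_PAIR_DA = 616.3711     # relative mass of a G/C pair (from the text)
AMU_KG = 1.660539e-27     # atomic mass unit in kg (from the text)
PG_PER_KG = 1e15          # 1 kg = 1e15 pg


def mass_pg(base_pairs, gc_fraction):
    """Return the DNA mass in picograms for a given size and GC fraction."""
    mean_da = gc_fraction * GC_PAIR_DA + (1.0 - gc_fraction) * AT_PAIR_DA
    return base_pairs * mean_da * AMU_KG * PG_PER_KG


# Haploid sizes and GC contents taken from the table.
X_BP, X_GC = 3_031_042_417, 0.4097460
Y_BP, Y_GC = 2_932_228_937, 0.4101724

x_mass = mass_pg(X_BP, X_GC)   # X gamete
y_mass = mass_pg(Y_BP, Y_GC)   # Y gamete
xx_mass = 2 * x_mass           # XX zygote: two X gametes
xy_mass = x_mass + y_mass      # XY zygote: one X gamete plus one Y gamete

print(f"X gamete:  {x_mass:.6f} pg")
print(f"Y gamete:  {y_mass:.6f} pg")
print(f"XX zygote: {xx_mass:.6f} pg (C-value {xx_mass / 2:.6f} pg)")
print(f"XY zygote: {xy_mass:.6f} pg")
```

Small differences in the last decimal place relative to the table are expected, since the table's densities are rounded to four decimals.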
- Greilhuber J, Doležel J, Lysák M, Bennett MD (2005). "The origin, evolution and proposed stabilization of the terms 'genome size' and 'C-value' to describe nuclear DNA contents". Annals of Botany 95 (1): 255–60. doi:10.1093/aob/mci019. PMID 15596473.
- Swift H (1950). "The constancy of deoxyribose nucleic acid in plant nuclei". Proceedings of the National Academy of Sciences of the USA 36 (11): 643–654. doi:10.1073/pnas.36.11.643. PMC 1063260. PMID 14808154.
- Gregory TR (2001). "Coincidence, coevolution, or causation? DNA content, cell size, and the C-value enigma". Biological Reviews 76 (1): 65–101. doi:10.1017/S1464793100005595. PMID 11325054.
- Gregory TR (2002). "A bird's-eye view of the C-value enigma: genome size, cell size, and metabolic rate in the class Aves". Evolution 56 (1): 121–30. doi:10.1111/j.0014-3820.2002.tb00854.x. PMID 11913657.
- Bennett MD, Leitch IJ (2005). "Genome size evolution in plants". In T.R. Gregory. The Evolution of the Genome. San Diego: Elsevier. pp. 89–162.
- Vendrely R, Vendrely C (1948). "La teneur du noyau cellulaire en acide désoxyribonucléique à travers les organes, les individus et les espèces animales : Techniques et premiers résultats". Experientia (in French) 4 (11): 434–436. doi:10.1007/bf02144998. PMID 18098821.
- Gregory TR (2005). "Genome size evolution in animals". In T.R. Gregory. The Evolution of the Genome. San Diego: Elsevier. pp. 3–87.
- Doležel J, Bartoš J, Voglmayr H, Greilhuber J (2003). "Letter to the editor: Nuclear DNA Content and Genome Size of Trout and Human". Cytometry 51A (2): 127–128. doi:10.1002/cyto.a.10013. PMID 12541287.
- Lander, ES; Linton, LM; Birren, B; Nusbaum, C; Zody, MC; Baldwin, J; Devon, K; Dewar, K et al. (2001). "International Human Genome Sequencing Consortium. Initial sequencing and analysis of the human genome". Nature 409 (6822): 860–921. doi:10.1038/35057062. PMID 11237011.
- "Assembly Statistics for GRCh38.p2". Genome Reference Consortium. 8 December 2014. Retrieved 8 February 2015.
- Stylianos E. Antonarakis (2010). Vogel and Motulsky’s Human Genetics: Problems and Approaches. Berlin Heidelberg: Springer-Verlag. p. 32. ISBN 978-3-540-37654-5. Retrieved 8 February 2015.
- Kokocinski, Felix. "Bioinformatics work notes". GC content of human chromosomes. Retrieved 8 February 2015.