Allele frequency


Allele frequency, or gene frequency, is the relative frequency of an allele (variant of a gene) at a particular locus in a population, expressed as a fraction or percentage.[1] Specifically, it is the fraction of all chromosomes in the population that carry that allele.

Given the following:

  1. a particular locus on a chromosome and a given allele at that locus
  2. a population of N individuals with ploidy n, i.e. an individual carries n copies of each chromosome in their somatic cells (e.g. two chromosomes in the cells of diploid species)
  3. the allele exists in i chromosomes in the population

then the allele frequency is the ratio of the number of occurrences i of that allele to the total number of chromosome copies across the population, i/(nN). For a diploid population, this ratio is i/(2N).
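The definition above can be sketched in a few lines of Python. The function name `allele_frequency` and its parameter names are illustrative choices, not taken from the source; the result is returned as an exact fraction to avoid rounding.

```python
from fractions import Fraction


def allele_frequency(copies: int, individuals: int, ploidy: int = 2) -> Fraction:
    """Return i / (n * N): allele copies over total chromosome copies.

    copies      -- i, number of chromosomes carrying the allele
    individuals -- N, number of individuals in the population or sample
    ploidy      -- n, chromosome copies per individual (2 for diploids)
    """
    if individuals <= 0 or ploidy <= 0:
        raise ValueError("individuals and ploidy must be positive")
    total_copies = ploidy * individuals
    if not 0 <= copies <= total_copies:
        raise ValueError("copies must lie between 0 and ploidy * individuals")
    return Fraction(copies, total_copies)
```

For instance, `allele_frequency(15, 10)` evaluates 15/(2 × 10) and returns `Fraction(3, 4)`.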

The allele frequency is distinct from the genotype frequency, although they are related, and allele frequencies can be calculated from genotype frequencies.[1]

In population genetics, allele frequencies are used to describe the amount of variation at a particular locus or across multiple loci. When considering the ensemble of allele frequencies for a large number of distinct loci, their distribution is called the allele frequency spectrum.

Calculation of allele frequencies from genotype frequencies

The actual frequency calculations depend on the ploidy of the species for autosomal genes.


When each individual carries a single copy of the locus (a monoploid, n = 1), the frequency p of an allele A is the number of copies i of the A allele divided by the population or sample size N, so

p = i/N.


In a diploid species, if $f(\mathbf{AA})$, $f(\mathbf{AB})$, and $f(\mathbf{BB})$ are the frequencies of the three genotypes at a locus with two alleles, then the frequency p of the A allele and the frequency q of the B allele in the population are obtained by counting alleles:

$p = f(\mathbf{AA}) + \frac{1}{2} f(\mathbf{AB}) = \text{frequency of A}$

$q = f(\mathbf{BB}) + \frac{1}{2} f(\mathbf{AB}) = \text{frequency of B}$

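Because p and q are the frequencies of the only two alleles present at that locus, they must sum to 1. To check this, add the two expressions:

$p + q = f(\mathbf{AA}) + f(\mathbf{AB}) + f(\mathbf{BB}) = 1$

so q = 1 - p and p = 1 - q.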

If there are more than two different allelic forms, the frequency for each allele is simply the frequency of its homozygote plus half the sum of the frequencies for all the heterozygotes in which it appears. Allele frequency can always be calculated from genotype frequency, whereas the reverse requires that the Hardy–Weinberg conditions of random mating apply.
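The rule for more than two alleles (homozygote frequency plus half of every heterozygote frequency) can be sketched as follows. This is a minimal illustration for a diploid locus; the function name, the tuple-keyed dictionary layout, and the three-allele numbers are made up for the example and do not come from the source.

```python
from collections import defaultdict


def allele_frequencies(genotype_freqs):
    """Convert {(allele_1, allele_2): frequency} into {allele: frequency}.

    Assumes a diploid locus and genotype frequencies that sum to 1.
    """
    total = sum(genotype_freqs.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("genotype frequencies must sum to 1")
    freqs = defaultdict(float)
    for (first, second), freq in genotype_freqs.items():
        # Each of the two allele copies in a genotype contributes half of
        # that genotype's frequency, so a homozygote adds its full
        # frequency to a single allele.
        freqs[first] += freq / 2
        freqs[second] += freq / 2
    return dict(freqs)


# Hypothetical three-allele locus (illustrative numbers only).
genotypes = {
    ("A", "A"): 0.2, ("A", "B"): 0.3, ("A", "C"): 0.2,
    ("B", "B"): 0.1, ("B", "C"): 0.1, ("C", "C"): 0.1,
}
result = allele_frequencies(genotypes)
```

With these inputs the allele frequencies come out at approximately 0.45 for A, 0.30 for B, and 0.25 for C, which sum to 1 as expected.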


Consider a locus that carries two alleles, A and B. In a diploid population there are three possible genotypes: two homozygous genotypes (AA and BB) and one heterozygous genotype (AB). If we sample 10 individuals from the population, and we observe the genotype counts

  1. count(AA) = 6
  2. count(AB) = 3
  3. count(BB) = 1

then there are $6 \times 2 + 3 = 15$ observed copies of the A allele and $1 \times 2 + 3 = 5$ of the B allele, out of 20 total chromosome copies. The frequency p of the A allele is p = 15/20 = 0.75, and the frequency q of the B allele is q = 5/20 = 0.25.
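The counting in this example can be reproduced directly from the numbers of sampled individuals. The sketch below assumes that each allele is named by a single character, so that a genotype such as "AB" can be split into its two allele copies; the helper name `allele_counts` is hypothetical.

```python
from collections import Counter


def allele_counts(genotype_counts):
    """Count allele copies from {genotype string: number of individuals}."""
    counts = Counter()
    for genotype, n_individuals in genotype_counts.items():
        # Every character in the genotype string is one allele copy
        # carried by each individual with that genotype.
        for allele in genotype:
            counts[allele] += n_individuals
    return counts


sample = {"AA": 6, "AB": 3, "BB": 1}
counts = allele_counts(sample)
total_copies = sum(counts.values())
freqs = {allele: n / total_copies for allele, n in counts.items()}

print(counts["A"], counts["B"], total_copies)  # 15 5 20
print(freqs["A"], freqs["B"])                  # 0.75 0.25
```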

Allele frequency dynamics

Population genetics describes the genetic composition of a population, including allele frequencies, and how allele frequencies are expected to change over time. The Hardy–Weinberg law describes the expected equilibrium genotype frequencies in a diploid population after random mating. Random mating alone does not change allele frequencies, and the Hardy–Weinberg equilibrium assumes an infinite population size and a selectively neutral locus.[1]
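For a locus with two alleles at frequencies p and q = 1 - p, the Hardy–Weinberg expected genotype frequencies are p², 2pq, and q². The text above does not spell these out, so the sketch below states them as the standard textbook result; the function name `hardy_weinberg` is illustrative. It also checks that allele frequencies recomputed from the expected genotype frequencies equal the starting values, which is the sense in which random mating alone leaves allele frequencies unchanged.

```python
def hardy_weinberg(p):
    """Expected genotype frequencies for a two-allele diploid locus."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    q = 1.0 - p
    return {"AA": p * p, "AB": 2 * p * q, "BB": q * q}


expected = hardy_weinberg(0.75)

# Recover the allele frequency of A from the genotype frequencies:
# homozygote frequency plus half the heterozygote frequency.
p_next = expected["AA"] + expected["AB"] / 2

print(expected)  # {'AA': 0.5625, 'AB': 0.375, 'BB': 0.0625}
print(p_next)    # 0.75
```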

In natural populations, natural selection, population demography and structure, and mutation combine to change allele frequencies across generations. Genetic drift changes allele frequencies through random sampling, caused by variance in offspring number in a finite population; small populations experience larger per-generation fluctuations in frequency than large populations. An allele at a particular locus may also confer some fitness effect on an individual carrying that allele, on which natural selection acts. Beneficial alleles tend to increase in frequency, while deleterious alleles tend to decrease in frequency. Even when an allele is selectively neutral, selection acting on nearby genes may change its frequency through hitchhiking or background selection.
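The effect of population size on drift can be illustrated with a small simulation. The sketch below assumes a Wright–Fisher-style model, which the text above does not name: a neutral, diploid population of constant size in which each generation's 2N allele copies are drawn at random according to the current frequency. The function and parameter names are hypothetical.

```python
import random


def simulate_drift(p0, pop_size, generations, seed=None):
    """Return the allele frequency in each generation under pure drift.

    Assumes a neutral locus, a diploid population of constant size
    `pop_size`, and binomial sampling of 2 * pop_size allele copies
    per generation (a Wright-Fisher-style model).
    """
    rng = random.Random(seed)
    n_copies = 2 * pop_size
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Each copy in the next generation carries the allele with
        # probability equal to the current frequency p.
        carriers = sum(1 for _ in range(n_copies) if rng.random() < p)
        p = carriers / n_copies
        trajectory.append(p)
    return trajectory


small = simulate_drift(0.5, pop_size=10, generations=50, seed=1)
large = simulate_drift(0.5, pop_size=1000, generations=50, seed=1)
```

Comparing `small` and `large` across several seeds should show the smaller population wandering further from 0.5 per generation. Once a trajectory reaches 0 or 1 it stays there, because this model includes no mutation or migration.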

While heterozygosity at a given locus decreases over time as alleles become fixed or lost in the population, variation is maintained in the population through new mutations and gene flow due to migration between populations. For details, see population genetics.

References

  1. Gillespie, John H. (2004). Population Genetics: A Concise Guide (2nd ed.). Baltimore, Md.: The Johns Hopkins University Press. ISBN 0801880084.

External links

Cheung, KH; Osier MV; Kidd JR; Pakstis AJ; Miller PL; Kidd KK (2000). "ALFRED: an allele frequency database for diverse populations and DNA polymorphisms". Nucleic Acids Research 28 (1): 361–3. doi:10.1093/nar/28.1.361. PMC 102486. PMID 10592274. 

Middleton, D; Menchaca L; Rood H; Komerofsky R (2002). "New allele frequency database:". Tissue Antigens 61 (5): 403–7. doi:10.1034/j.1399-0039.2003.00062.x. PMID 12753660.