Glossary of genetics

From Wikipedia, the free encyclopedia

This glossary of genetics is a list of definitions of terms and concepts commonly used in the study of genetics and related disciplines in biology, including molecular biology and evolutionary biology.[1] It is intended as introductory material for novices; for more specific and technical detail, see the article corresponding to each term.


A ribose ring with the carbon atoms numbered 1' through 5' according to chemical convention. The 5' carbon is said to be upstream; the 3' carbon is said to be downstream. Bonds to a generic base and a phosphate group are also shown.

3'-end

Also rendered as three-prime end.

The end of a single strand of DNA or RNA at which the chain of nucleotides terminates at the third carbon atom in the furanose ring of deoxyribose or ribose (i.e. the terminus at which the 3' carbon is not attached to another nucleotide via a phosphodiester bond; in vivo, the 3' carbon is often still bonded to a hydroxyl group). By convention, sequences and structures positioned nearer to the 3'-end relative to others are referred to as downstream. Contrast 5'-end.
5' cap

Also rendered as five-prime cap.

A specially altered nucleotide attached to the 5'-end of some primary RNA transcripts as part of the set of post-transcriptional modifications which convert raw transcripts into mature RNA products. The precise structure of the 5' cap varies widely by organism; in eukaryotes, the most basic cap consists of a 7-methylguanosine (a methylated guanine nucleoside) bonded to the triphosphate group that terminates the 5'-end of an RNA sequence. Among other functions, capping helps to regulate the export of mature RNAs from the nucleus, prevent their degradation by exonucleases, and promote translation in the cytoplasm. Mature mRNAs can also be decapped.

5'-end

Also rendered as five-prime end.

The end of a single strand of DNA or RNA at which the chain of nucleotides terminates at the fifth carbon atom in the furanose ring of deoxyribose or ribose (i.e. the terminus at which the 5' carbon is not attached to another nucleotide via a phosphodiester bond; in vivo, the 5' carbon is often still bonded to a phosphate group). By convention, sequences and structures positioned nearer to the 5'-end relative to others are referred to as upstream. Contrast 3'-end.


activator
A type of transcription factor that increases the transcription of a gene or set of genes. Most activators work by binding to a specific sequence located within or near an enhancer or promoter and facilitating the binding of RNA polymerase and other transcription machinery in the same region. See also coactivator; contrast repressor.

adenine

Abbreviated in shorthand with the letter A.

One of the four main nucleobases present in DNA and RNA. Adenine forms a base pair with thymine in DNA and with uracil in RNA.
affected relative pair
Any pair of organisms which are related genetically and both affected by the same trait. For example, two cousins who both have blue eyes are an affected relative pair since they are both affected by the allele that codes for blue eyes.
allele
One of multiple alternative versions of an individual gene, each of which is a viable DNA sequence occupying a given position, or locus, on a chromosome. For example, in humans, one allele of the eye-color gene produces blue eyes and another allele of the eye-color gene produces brown eyes.
allele frequency
The relative frequency with which a particular allele of a given gene (as opposed to other alleles of the same gene) occurs at a particular locus in the members of a population; more specifically, it is the proportion of all chromosomes within a population that carry a particular allele, expressed as a fraction or percentage. Allele frequency is distinct from genotype frequency, although they are related.
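As an illustration of the definition above, the following minimal sketch counts alleles across a sample of diploid genotypes and reports each allele's share of all gene copies. The function name `allele_frequencies` and the sample data are hypothetical and chosen only for this example.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Return the proportion of each allele among all gene copies.

    `genotypes` is a list of tuples, one tuple of alleles per individual
    (two alleles each for a diploid locus).
    """
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

# Hypothetical sample: 3 AA, 4 Aa, and 3 aa individuals.
sample = [("A", "A")] * 3 + [("A", "a")] * 4 + [("a", "a")] * 3
freqs = allele_frequencies(sample)
print(freqs)
```

In this sample the genotype frequencies are 0.3, 0.4, and 0.3, while each allele accounts for half of the 20 gene copies, which shows how the two measures are related but distinct.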

allosome

Also called a sex chromosome, heterochromosome, or idiochromosome.

Any chromosome that differs from an ordinary autosome in size, form, or behavior and which is responsible for determining the sex of an organism. In humans, the X chromosome and the Y chromosome are sex chromosomes.
alternative splicing

Also called differential splicing or simply splicing.

A regulated phenomenon of eukaryotic gene expression in which specific exons or parts of exons from the same primary transcript are variably included within or removed from the final, mature messenger RNA transcript. A class of post-transcriptional modification, alternative splicing allows a single gene to code for multiple protein isoforms and greatly increases the diversity of proteins that can be produced by an individual genome. See also RNA splicing.
amino acid
An organic compound containing amine and carboxyl functional groups, as well as a side chain specific to each individual amino acid. Out of nearly 500 known amino acids, a set of 20 are coded for by the standard genetic code and incorporated in sequence as the building blocks of polypeptides and hence of proteins. The specific sequence of amino acids in the polypeptide chains that form a protein is ultimately responsible for determining the protein's structure and function.
anaphase
The stage of mitosis that occurs after metaphase and before telophase, when the replicated chromosomes are segregated and the sister chromatids of each are moved to opposite sides of the cell.
aneuploidy
The condition of a cell or organism having an abnormal number of one or more specific individual chromosomes (but excluding abnormal numbers of complete sets of chromosomes, which instead is known as euploidy).
anticipation
A phenomenon by which the symptoms of a genetic disorder become apparent (and often more severe) at an earlier age in affected individuals with each generation that inherits the disorder.
anticodon
A series of three consecutive nucleotides within a transfer RNA which complement the three nucleotides of a codon within an mRNA transcript. During translation, each tRNA recruited to the ribosome contains a single anticodon triplet that pairs with one or more complementary codons from the mRNA sequence, allowing each codon to specify a particular amino acid to be added to the growing peptide chain. Anticodons containing inosine in the first position are capable of pairing with more than one codon due to a phenomenon known as wobble base pairing.
antiparallel
The orientation of two strands of a double-stranded nucleic acid (and more generally any pair of biopolymers) which are parallel to each other but with opposite directionality. For example, the two complementary strands of a DNA molecule run side-by-side but in opposite directions, with one strand oriented 5'-to-3' and the other 3'-to-5'.
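Because the two strands run in opposite directions, the partner of a DNA sequence written 5'-to-3' is obtained by complementing each base and then reversing the result. The sketch below assumes an uppercase sequence containing only A, C, G, and T; the helper name `reverse_complement` is illustrative.

```python
# Map each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(strand):
    """Return the complementary strand, also written 5'-to-3'."""
    return strand.translate(COMPLEMENT)[::-1]

top = "ATGCCG"                     # read 5'-to-3'
bottom = reverse_complement(top)   # partner strand, also read 5'-to-3'
print(top, bottom)
```

Applying the function twice returns the original sequence, which is a quick sanity check on the antiparallel relationship.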
autosome
Any chromosome that is not an allosome and hence is not involved in the determination of the sex of an organism. Unlike the sex chromosomes, the autosomes in a diploid cell exist in pairs, with the members of each pair having the same structure, morphology, and genetic loci.


backcross
The breeding of a hybrid organism with one of its parents or an individual genetically similar to one of its parents, often intentionally as a type of selective breeding, with the aim of producing offspring with a genetic identity which is closer to that of the parent. The reproductive event and the resulting progeny are both referred to as a backcross and often abbreviated in genetics shorthand with the symbol BC.
bacterial artificial chromosome (BAC)
base pair (bp)
A pair of two nucleobases on complementary DNA or RNA strands which are bound to each other by hydrogen bonds. The ability of consecutive base pairs to stack one upon another contributes to the long-chain double helix structures observed in both double-stranded DNA and double-stranded RNA molecules.
baseline expression
A measure of the gene expression level of a gene or genes prior to a perturbation in an experiment, as in a negative control. Baseline expression may also refer to the expected or historical measure of expression for a gene.
blunt end


canalization
The ability of a population to consistently produce the same phenotype regardless of the variability of its environment or the genetic variation within its genome. The concept is most often used in developmental biology to interpret the observation that developmental pathways are frequently shaped by natural selection such that developing cell lineages are "guided" or "canalized" towards a single, definite fate, regardless of any minor variations that may disturb the cells during development.
candidate gene
A gene whose location on a chromosome is associated with a particular phenotype (often a disease-related phenotype), and which is therefore suspected of causing or contributing to the phenotype. Candidate genes are often selected for study based on a priori knowledge or speculation about their functional relevance to the trait or disease being researched.
carrier
An individual who has inherited a recessive allele for a genetic trait or mutation but in whom the trait is not usually expressed or observable in the phenotype. Carriers are usually heterozygous for the recessive allele and therefore still able to pass the allele on to their offspring, where the associated phenotype may reappear if the offspring inherits another copy of the allele. The term is commonly used in medical genetics in the context of a disease-causing recessive allele.

CCAAT box

Also abbreviated as CAAT box or CAT box.

cellular reprogramming
The conversion of a cell from one tissue-specific cell type to another. This involves dedifferentiation to a pluripotent state; an example is the conversion of mouse somatic cells to an undifferentiated embryonic state, which relies on the transcription factors Oct4, Sox2, Myc, and Klf4.[2]
centimorgan (cM)

Also called a map unit (m.u.).

A unit for measuring genetic linkage defined as the distance between chromosomal loci for which the expected average number of intervening chromosomal crossovers in a single generation is 0.01. Though it is not an actual measure of physical distance, it is used to infer the distance between two loci based on the apparent likelihood of a crossover occurring between them.
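Map distance and observed recombination are not the same quantity, because multiple crossovers between two loci can cancel out. One common way to relate them is Haldane's mapping function, which assumes crossovers occur independently (no interference); it is a modelling choice and not part of the definition above. A minimal sketch:

```python
import math

def haldane_recombination_fraction(distance_cm):
    """Expected recombination fraction for a map distance in centimorgans,
    under Haldane's no-interference assumption."""
    morgans = distance_cm / 100.0   # 100 cM = 1 Morgan
    return 0.5 * (1.0 - math.exp(-2.0 * morgans))

for d in (1, 10, 50, 200):
    print(d, "cM ->", round(haldane_recombination_fraction(d), 4))
```

For short distances the recombination fraction is close to the map distance divided by 100, while for very distant loci it approaches but never exceeds 0.5.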
centromere
The part of a chromosome that links a pair of sister chromatids. During mitosis, spindle fibers attach to the centromere via kinetochores.
chromatid
One copy of a newly copied chromosome, which is joined to the original chromosome by a centromere.
chromatin
A complex of DNA, RNA, and protein found in eukaryotic cells that is the primary substance comprising chromosomes. Chromatin functions as a means of packaging very long DNA molecules into highly organized and densely compacted shapes, which prevents the strands from becoming tangled, reinforces the DNA during cell division, helps to prevent DNA damage, and plays an important role in regulating gene expression and DNA replication.
chromosomal crossover

Also called crossing over.

chromosomal duplication
chromosome
A DNA molecule containing part or all of the genetic material of an organism. Chromosomes may be considered a sort of molecular "package" for carrying DNA within the nucleus of cells and, in most eukaryotes, are composed of long strands of DNA coiled with packaging proteins which bind to and condense the strands to prevent them from becoming an unmanageable tangle. Chromosomes are most easily distinguished and studied in their completely condensed forms, which only occur during cell division. Some simple organisms have only one chromosome made of circular DNA, while most eukaryotes have multiple chromosomes made of linear DNA.
cis-dominant mutation
A mutation occurring within a cis-regulatory element (such as an operator) which alters the functioning of a nearby gene or genes on the same strand of DNA. Cis-dominant mutations can affect the expression of genes because they occur at sites that control the transcription of the genes rather than within the genes themselves.
cis-regulatory element (CRE)
classical genetics
The branch of genetics based solely on observation of the visible results of reproductive acts, as opposed to that made possible by the modern techniques and methodologies of molecular biology. Contrast molecular genetics.
cloning
The process of producing, either naturally or artificially, individual organisms or cells which are genetically identical to each other. Clones are the result of all forms of asexual reproduction, and cells that undergo mitosis produce daughter cells that are clones of the parent cell and of each other. Cloning may also refer to biotechnology methods which artificially create copies of organisms or cells, or, in molecular cloning, copies of DNA fragments or other molecules.
coactivator
A type of coregulator that increases the expression of one or more genes by binding to an activator.
codon
A series of three consecutive nucleotides in a coding region of a nucleic acid sequence, which codes for a particular amino acid or stop signal during protein synthesis. DNA and RNA molecules are each written in a language using four "letters" (four different nucleobases), but the language used to construct proteins includes 20 "letters" (20 different amino acids). Codons provide the key that allows these two languages to be translated into each other. In general, each codon corresponds to a single amino acid (or stop signal), and the full set of codons is called the genetic code.
codon usage bias
cofactor
Any non-protein organic compound that is bound to an enzyme. Cofactors are required for the initiation of catalysis.
comparative genomic hybridization (CGH)
complementary DNA (cDNA)
complex trait
See quantitative trait.
conditional expression
The controlled, inducible expression of a transgene, either in vitro or in vivo.
consensus sequence

Also called a canonical sequence.

A calculated order of the most frequent residues (of either nucleotides or amino acids) found at each position in a common sequence alignment and obtained by comparing multiple closely related sequence alignments.
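The simplest form of this calculation takes the most frequent residue in each column of an alignment. The sketch below assumes the sequences are already aligned to equal length; the example sequences are invented for illustration, and ties are broken arbitrarily rather than with ambiguity codes.

```python
from collections import Counter

def consensus(aligned):
    """Most frequent residue at each column of equal-length aligned sequences."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return "".join(
        Counter(column).most_common(1)[0][0]   # top residue in this column
        for column in zip(*aligned)
    )

# Hypothetical alignment of four short sequences.
alignment = ["TATAAT", "TATGAT", "TACAAT", "TATAAT"]
result = consensus(alignment)
print(result)
```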
conservation genetics
An interdisciplinary branch of population genetics that applies genetic methods and concepts in an effort to understand the dynamics of genes in populations principally to avoid extinctions and to conserve and restore biodiversity.
conserved sequence
A nucleic acid or protein sequence that is highly similar or identical across many species or within a genome, indicating that it has remained relatively unchanged through a long period of evolutionary time.
constitutive expression
The continuous transcription of a gene, as opposed to facultative expression, in which a gene is only transcribed as needed. A gene that is transcribed continuously is called a constitutive gene.
copy-number variation (CNV)
A phenomenon in which sections of a genome are repeated and the number of repeats varies between individuals in the population, usually as a result of duplication or deletion events that affect entire genes or sections of chromosomes. Copy-number variations play an important role in generating genetic variation within a population.
coregulator
A protein that works together with one or more transcription factors to regulate gene expression.
corepressor
A type of coregulator that reduces (represses) the expression of one or more genes by binding to and activating a repressor.
CRISPR gene editing
cytogenetics
The branch of genetics that studies how chromosomes influence and relate to cell behavior and function, particularly during mitosis and meiosis.

cytosine

Abbreviated in shorthand with the letter C.

One of the four main nucleobases present in DNA and RNA. Cytosine forms a base pair with guanine.


degeneracy
The redundancy of the genetic code, exhibited as the multiplicity of different codons that can specify the same amino acid. For example, in the standard genetic code, the amino acid serine is specified by six unique codons (UCA, UCG, UCC, UCU, AGU, and AGC). Codon degeneracy accounts for the existence of synonymous mutations.
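The serine example can be checked by building the standard genetic code as a lookup table and grouping codons by the amino acid they specify. The sketch below uses one-letter amino acid codes with `*` for stop signals; the variable names are illustrative.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order (first base varies slowest); '*' marks stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group synonymous codons under the amino acid they encode.
synonyms = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    synonyms[aa].append(codon)

serine_codons = sorted(synonyms["S"])
print(serine_codons)
print({aa: len(codons) for aa, codons in sorted(synonyms.items())})
```

Methionine and tryptophan each have a single codon, whereas serine, leucine, and arginine have six, so the degree of degeneracy varies between amino acids.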

deletion

Denoted in shorthand with the symbol Δ.

A type of mutation in which one or more bases are removed from a nucleic acid sequence.
deoxyribonucleic acid (DNA)
A polymeric nucleic acid molecule composed of a series of deoxyribonucleotides which incorporate a set of four nucleobases: adenine (A), guanine (G), cytosine (C), and thymine (T). DNA is most often found in the form of a "double helix", which consists of two paired complementary DNA molecules and resembles a ladder that has been twisted. The "rungs" of the ladder are made of pairs of nucleobases.

diploid

Denoted in shorthand with the somatic number 2n.

(of a cell or organism) Having two homologous copies of each chromosome. Contrast haploid and polyploid.
distance measure
Any quantity used to measure the dissimilarity between the gene expression levels of different genes.[3]
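Two quantities often used for this purpose are Euclidean distance and a correlation-based distance (one minus the Pearson correlation). The sketch below is illustrative only; the expression profiles are invented, and the choice of these two measures is an assumption rather than something the definition prescribes.

```python
import math

def euclidean_distance(x, y):
    """Straight-line distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def pearson_distance(x, y):
    """One minus the Pearson correlation; 0 means perfectly correlated.
    Undefined (division by zero) if either profile is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return 1.0 - cov / (sd_x * sd_y)

# Hypothetical expression levels of two genes across four samples.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [2.0, 4.0, 6.0, 8.0]
print(euclidean_distance(gene_a, gene_b), pearson_distance(gene_a, gene_b))
```

The two genes here differ in magnitude, so their Euclidean distance is large, but they rise and fall together, so their correlation distance is essentially zero. Which measure is appropriate depends on whether absolute level or pattern matters.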
DNA condensation
DNA fingerprinting
DNA microarray
A high-throughput technology used to measure expression levels of mRNA transcripts or to detect certain changes in the nucleotide sequence. It consists of an array of thousands of microscopic spots of DNA oligonucleotides, called features, each containing picomoles of a specific DNA sequence. This can be a short section of a gene or other DNA element that is used as a probe to hybridize a cDNA, cRNA or genomic DNA sample (called a target) under high-stringency conditions. Probe-target hybridization is usually detected and quantified by fluorescence-based detection of fluorophore-labeled targets.
DNA polymerase
One of a class of enzymes that synthesizes DNA molecules from individual deoxyribonucleotides. DNA polymerases are essential for DNA replication and usually work in pairs to create identical copies of the two strands of an original double-stranded molecule. They build long chains of DNA by adding nucleotides one at a time to the 3'-end of a DNA strand, usually relying on the template provided by the complementary strand to copy the nucleotide sequence faithfully.
DNA repair
The collection of processes by which a cell identifies and corrects structural damage or mutations in the DNA molecules that encode its genome. The ability of a cell to repair its DNA is vital to the integrity of the genome and the normal functionality of the organism.
DNA replication
The process by which a DNA molecule copies itself, producing two identical copies of one original DNA molecule.
DNA sequencing
The process of determining, by any of a variety of different methods and technologies, the order of the bases in the long chain of nucleotides that constitutes a sequence of DNA.
dominance
A relationship between the alleles of a gene in which one allele produces an effect on phenotype that overpowers or "masks" the contribution of another allele at the same locus; the first allele and its associated phenotypic trait are said to be dominant, and the second allele and its associated trait are said to be recessive. Often, the dominant allele codes for a functional protein while its recessive counterpart does not. Dominance is not an inherent property of any allele or phenotype, but simply describes its relationship to one or more other alleles or phenotypes; it is possible for one allele to be simultaneously dominant over a second allele, recessive to a third, and codominant to a fourth. In genetics shorthand, dominant alleles are often represented by a single uppercase letter (e.g. "A", in contrast to the recessive "a").
dosage compensation
Any mechanism by which organisms neutralize the large difference in gene dosage caused by the presence of differing numbers of sex chromosomes in the different sexes, thereby equalizing the expression of sex-linked genes so that the members of each sex receive the same or similar amounts of the products of such genes. An example is X-inactivation in female mammals.
double helix
double-stranded DNA (dsDNA)

downregulation

Also called repression or suppression.

Any process, natural or artificial, which decreases the rate of gene expression of a certain gene. A gene which is observed to have lower expression (such as by detecting lower levels of its mRNA transcripts) in one sample than in another sample (often a control) is said to be downregulated. Contrast upregulation.
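In practice, the comparison between a sample and a control is often summarized as a log2 fold change, with negative values indicating lower expression in the sample. The sketch below is a simplified illustration; the pseudocount and the threshold of one log2 unit (a twofold change) are arbitrary assumptions, and real analyses also test for statistical significance.

```python
import math

def log2_fold_change(sample, control, pseudocount=1.0):
    """log2 ratio of sample to control; the pseudocount avoids division by zero."""
    return math.log2((sample + pseudocount) / (control + pseudocount))

def classify(sample, control, threshold=1.0):
    """Label a gene by comparing its log2 fold change with a cutoff."""
    lfc = log2_fold_change(sample, control)
    if lfc <= -threshold:
        return "downregulated"
    if lfc >= threshold:
        return "upregulated"
    return "unchanged"

# Hypothetical transcript counts: (sample, control).
print(classify(24, 99), classify(399, 99), classify(100, 100))
```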
downstream
Towards or closer to the 3'-end of a chain of nucleotides. Contrast upstream.


ecological genetics
The study of genetics as it pertains to the ecology and fitness of natural populations of living organisms.
emergenesis
The quality of genetic traits that results from a specific configuration of interacting genes, rather than simply their combination.
endonuclease
Any enzyme whose activity is to cleave phosphodiester bonds within a chain of nucleotides, including those that cleave relatively nonspecifically (without regard to sequence) and those that cleave only at very specific sequences (so-called restriction endonucleases). When recognition of a specific sequence is required, endonucleases make their cuts in the middle of the sequence. Contrast exonuclease.
endoplasmic reticulum (ER)
enhancer
A region of DNA near a gene that can be bound by an activator to increase gene expression or by a repressor to decrease expression.
episome
1.  Another name for a plasmid, especially one that is capable of integrating into a chromosome.
2.  In eukaryotes, any non-integrated extrachromosomal circular DNA molecule that is stably maintained and replicated in the nucleus simultaneously with the rest of the host cell. Such molecules may include viral genomes, bacterial plasmids, and aberrant chromosomal fragments.
epistasis
The collective action of multiple genes interacting during gene expression. A form of gene action, epistasis can be either additive or multiplicative in its effects on specific phenotypic traits.
euploidy
The condition of a cell or organism having an abnormal number of complete sets of chromosomes, possibly excluding the sex chromosomes. Euploidy differs from aneuploidy, in which a cell or organism has an abnormal number of one or more specific individual chromosomes.
evolution
The change in the heritable characteristics of biological populations over successive generations. In the most traditional sense, it occurs by changes in the frequencies of alleles in a population's gene pool.
exon
Any part of a gene that encodes a part of the final mature mRNA produced by that gene after introns have been removed by RNA splicing. The term refers to both the sequence as it exists within a DNA molecule and to the corresponding sequence in RNA transcripts.
exon skipping
exonuclease
Any enzyme whose activity is to cleave phosphodiester bonds within a chain of nucleotides, including those that cleave only upon recognition of a specific sequence (so-called restriction exonucleases). Exonucleases make their cuts at either the 3' or 5'-end of the sequence (rather than in the middle, as with endonucleases).
exosome complex
expression vector

Also called an expression construct or simply a vector.

expressivity
For a given genotype associated with a variable non-binary phenotype, the proportion of individuals with that genotype who show or express the phenotype to a specified extent, usually given as a percentage. Because of the many complex interactions that govern gene expression, the same allele may produce a wide variety of possible phenotypes of differing qualities or degrees in different individuals; in such cases, both the phenotype and genotype may be said to show variable expressivity. Expressivity attempts to quantify the range of possible levels of phenotypic variation in a population of individuals expressing the phenotype of interest. Compare penetrance.
extrachromosomal DNA

Also called extranuclear DNA or cytoplasmic DNA.

Any DNA that is not found in chromosomes or in the nucleus of a cell and hence is not genomic DNA. This may include the DNA contained in plasmids or organelles such as mitochondria or chloroplasts, or, in the broadest sense, DNA introduced by viral infection. Extrachromosomal DNA usually shows significant structural differences from nuclear DNA in the same organism.


facultative expression
The transcription of a gene only as needed, as opposed to constitutive expression, in which a gene is transcribed continuously. A gene that is transcribed as needed is called a facultative gene.
fluorescence in situ hybridization (FISH)
fixation
The process by which a single allele for a particular gene with multiple different alleles increases in frequency in a given population such that it becomes permanently established at 100% frequency – that is, the only allele at that locus within the population's gene pool. In the absence of mutation and heterozygote advantage, any given allele is eventually destined to become either permanently fixed over all other variants or completely lost from the population, though how long this takes depends on selection pressures and chance fluctuations in allele frequencies.
frameshift mutation
A type of mutation in a nucleic acid sequence caused by the insertion or deletion of a number of nucleotides that is not divisible by three. Because of the triplet nature by which nucleotides code for amino acids, a mutation of this sort causes a shift in the reading frame of the nucleotide sequence, resulting in the sequence of codons downstream of the mutation site being completely different from the original.
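The effect on the reading frame can be seen by splitting a sequence into consecutive triplets before and after a deletion. In the sketch below, the sequence and the helper `codons` are invented for illustration; deleting one base shifts every downstream codon, whereas deleting three bases removes one codon and leaves the rest intact.

```python
def codons(sequence):
    """Split a sequence into consecutive triplets, dropping any incomplete tail."""
    return [sequence[i:i + 3] for i in range(0, len(sequence) - 2, 3)]

original = "ATGGCCTTAGGA"
deleted_one = original[:4] + original[5:]     # one base removed: frameshift
deleted_three = original[:3] + original[6:]   # one whole codon removed: frame kept

print(codons(original))
print(codons(deleted_one))
print(codons(deleted_three))
```

In this example the single-base deletion also happens to create the triplet TAG, a stop codon in the standard genetic code, which illustrates how frameshifts frequently lead to premature termination.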
Functional Genomics Data (FGED) Society

Formerly known by the abbreviation MGED.

An organization that works with others "to develop standards for biological research data quality, annotation and exchange" as well as software tools that facilitate their use.[4]


G banding

Also Giemsa banding or G-banding.

A technique used in cytogenetics to produce a visible karyotype by staining the condensed chromosomes with Giemsa stain. The staining produces consistent and identifiable patterns of dark and light "bands" in regions of chromatin, which allows specific chromosomes to be easily distinguished.
gene
Any segment or set of segments of a nucleic acid molecule that contains the information necessary to produce a functional RNA transcript in a controlled manner. In living organisms, genes are often considered the fundamental units of heredity and are typically encoded in DNA. A particular gene can have multiple different versions, or alleles, and a single gene can result in a gene product that influences many different phenotypes.
gene dosage
The number of copies of a particular gene present in a genome. Gene dosage directly influences the amount of gene product a cell is able to express, though a variety of controls have evolved which tightly regulate gene expression. Changes in gene dosage caused by mutations include copy-number variations.
gene drive
gene duplication

Also called gene amplification.

A type of mutation defined as any duplication of a region of DNA that contains a gene. Compare chromosomal duplication.
gene expression
The process by which the information encoded in a gene is converted into a form useful for the cell. The first step is transcription, which produces a messenger RNA molecule complementary to the DNA molecule in which the gene is encoded. For protein-coding genes, the second step is translation, in which the messenger RNA is read by the ribosome to produce a protein.
Gene Expression Omnibus (GEO)
A database for gene expression managed by the National Center for Biotechnology Information.
gene mapping
gene pool
The sum of all of the various alleles shared by the members of a single population.
gene product
Any of the biochemical material resulting from the expression of a gene, most often interpreted as the functional mRNA transcript produced by transcription of the gene or the fully constructed protein produced by translation of the transcript. A measurement of the quantity of a given gene product that is detectable in a cell or tissue is sometimes used to infer how active the corresponding gene is.
gene regulation
The broad range of mechanisms used by cells to increase or decrease the production or expression of specific gene products, such as RNA or proteins. Gene regulation increases an organism's versatility and adaptability by allowing its cells to express different gene products when required by changes in its environment. In multicellular organisms, the regulation of gene expression also drives cellular differentiation and morphogenesis in the embryo, enabling the creation of a diverse array of cell types from the same genome.
gene silencing
gene therapy
gene trapping
A high-throughput technology used to simultaneously inactivate, identify, and report the expression of a target gene in a mammalian genome by introducing an insertional mutation consisting of a promoterless reporter gene and/or a selectable genetic marker flanked by an upstream splice site and a downstream polyadenylated termination sequence.
genetic association
The co-occurrence within a population of one or more alleles or genotypes with a particular phenotypic trait more often than might be expected by chance alone; such statistical correlation may be used to infer that the alleles or genotypes are responsible for producing the given phenotype.
genetic code
A set of rules by which information encoded within nucleic acids is translated into proteins by living cells. These rules define how sequences of nucleotide triplets called codons specify which amino acid will be added next during protein synthesis. The vast majority of living organisms use the same genetic code (sometimes referred to as the "standard" genetic code) but variant codes do exist.
genetic counseling
genetic disorder
genetic distance
A measure of the genetic divergence between species, populations within a species, or individuals, used especially in phylogenetics to express either the time elapsed since the existence of a common ancestor or the degree of differentiation in the DNA sequences comprising the genomes of each population or individual.
genetic diversity

Sometimes used interchangeably with genetic variation.

The total number of genetic traits or characteristics in the genetic make-up of a population, species, or other group of organisms. It is often used as a measure of the adaptability of a group to changing environments. Genetic diversity is similar to, though distinct from, genetic variability.
genetic drift

Also called allelic drift or the Sewall Wright effect.

A change in the frequency with which an existing allele occurs in a population due to random variation in the distribution of alleles from one generation to the next. It is often interpreted as the role that random chance plays in determining whether a given allele becomes more or less common with each generation, regardless of the influence of natural selection. Genetic drift may cause certain alleles, even otherwise advantageous ones, to disappear completely from the gene pool, thereby reducing genetic variation, or it may cause initially rare alleles, even neutral or deleterious ones, to become much more frequent or even fixed.
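Drift is commonly illustrated with a Wright-Fisher style simulation, in which each generation's gene copies are drawn at random from the previous generation's allele frequency. The sketch below assumes a diploid population of constant size with no selection, mutation, or migration; the function name and parameters are illustrative.

```python
import random

def wright_fisher(initial_freq, population_size, generations, seed=None):
    """Simulate the frequency of one allele under random sampling alone."""
    rng = random.Random(seed)
    copies = 2 * population_size          # diploid: two gene copies per individual
    freq = initial_freq
    trajectory = [freq]
    for _generation in range(generations):
        # Each gene copy in the next generation carries the allele with
        # probability equal to its current frequency.
        count = sum(1 for _copy in range(copies) if rng.random() < freq)
        freq = count / copies
        trajectory.append(freq)
    return trajectory

trajectory = wright_fisher(0.5, population_size=20, generations=100, seed=1)
print(trajectory[:5], "...", trajectory[-1])
```

Running the simulation with different seeds or population sizes shows that smaller populations fluctuate more strongly and that an allele reaching a frequency of 0 or 1 stays there, since no process in this model can reintroduce or remove it.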
genetic engineering
genetic epidemiology
genetic erosion
genetic genealogy
The use of genealogical DNA testing in combination with traditional genealogical methods to infer the level and type of genetic relationships between individuals, find ancestors, and construct family trees, genograms, or other genealogical charts.
genetic hitchhiking

Also called genetic draft or the hitchhiking effect.

genetic marker
A specific, easily identifiable, and usually highly polymorphic gene or other DNA sequence with a known location on a chromosome that can be used to identify the individual or species possessing it.
genetic recombination
genetic regulatory network (GRN)
A graph that represents the regulatory complexity of gene expression. The vertices (nodes) are represented by various regulatory elements and gene products while the edges (links) are represented by their interactions. These network structures also represent functional relationships by approximating the rate at which genes are transcribed.
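A small network of this kind can be represented as a directed graph stored in a dictionary. In the sketch below, the gene names and the convention of +1 for activation and -1 for repression are hypothetical; real networks typically carry quantitative edge weights or rate parameters.

```python
# Each key is a regulator; its value maps target genes to the sign of the effect
# (+1 activation, -1 repression). All names are invented for illustration.
network = {
    "geneA": {"geneB": +1, "geneC": -1},
    "geneB": {"geneC": +1},
    "geneC": {},
}

def targets_of(network, regulator):
    """Genes directly regulated by `regulator`."""
    return sorted(network.get(regulator, {}))

def regulators_of(network, target):
    """Genes that directly regulate `target`."""
    return sorted(src for src, edges in network.items() if target in edges)

print(targets_of(network, "geneA"))
print(regulators_of(network, "geneC"))
```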
genetic testing

Also called DNA testing or genetic screening.

A broad class of various procedures used to identify features of an individual's particular chromosomes, genes, or proteins in order to determine parentage or ancestry, diagnose vulnerabilities to heritable diseases, or detect mutant alleles associated with increased risks of developing genetic disorders. Genetic testing is widely used in human medicine, agriculture, and biological research.
genetic variability

Sometimes used interchangeably with genetic variation.

The formation or the presence of individuals differing in genotype within a population or other group of organisms, as opposed to individuals with environmentally induced differences, which cause only temporary, non-heritable changes in phenotype. Barring other limitations, a population with high genetic variability has a greater potential for successful adaptation to changing environmental conditions than a population with low genetic variability. Genetic variability is similar to, though distinct from, genetic diversity.
genetic variation

Sometimes used interchangeably with genetic diversity and genetic variability.

The genetic differences both within and between populations, species, or other groups of organisms. It is often visualized as the variety of different alleles in the gene pools of different populations.
genetically modified organism (GMO)
genetics
The field of biology that studies genes, genetic variation, and heredity in living organisms.
genome
The entire complement of genetic material contained within the chromosomes of an organism, organelle, or virus. The term is also used to refer to the collective set of genetic loci shared by every member of a population or species, regardless of the different alleles that may be present at these loci in different individuals.
genome size
The total amount of DNA contained within one copy of a genome, typically measured by mass (in picograms or daltons) or by the total number of base pairs (in kilobases or megabases). For diploid organisms, genome size is often used interchangeably with C-value.
genomic DNA (gDNA)

Also called chromosomal DNA.

The DNA contained in chromosomes, as opposed to the extrachromosomal DNA contained in separate structures such as plasmids or organelles such as mitochondria or chloroplasts.
genomic imprinting
genomics
An interdisciplinary field that studies the structure, function, evolution, mapping, and editing of entire genomes, as opposed to individual genes.
genotype
The entire complement of alleles present in a particular individual's genome, which gives rise to the individual's phenotype.
genotype frequency
genotyping
The process of determining differences in the genotype of an individual by examining the DNA sequences in the individual's genome using bioassays and comparing them to another individual's sequences or a reference sequence.
germ cell
Any biological cell that gives rise to the gametes of an organism that reproduces sexually. Germ cells are the vessels for the genetic material which will ultimately be passed on to the organism's descendants and are usually distinguished from somatic cells, which are entirely separate from the germ line.
germ line
1.  In multicellular organisms, the population of cells which are capable of passing on their genetic material to the organism's progeny and are therefore (at least theoretically) distinct from somatic cells. The cells of the germ line are called germ cells.
2.  The lineage of germ cells, spanning many generations, that contains the genetic material which has been passed on to an individual from its ancestors.
guanine

Abbreviated in shorthand with the letter G.

One of the four main nucleobases present in DNA and RNA. Guanine forms a base pair with cytosine.
guanine-cytosine content

Also abbreviated GC-content.

The proportion of nitrogenous bases in a nucleic acid that are either guanine (G) or cytosine (C), typically expressed as a percentage. DNA and RNA molecules with higher GC-content are generally more thermostable than those with lower GC-content due to molecular interactions that occur during base stacking.[5]
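A minimal sketch of the GC-content calculation described above. The function name `gc_content` is hypothetical, and the sketch assumes that every character in the input counts toward the total length (so ambiguity codes such as N lower the reported percentage).

```python
def gc_content(seq):
    """Return the percentage of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)

if __name__ == "__main__":
    print(gc_content("ATGC"))  # two of the four bases are G or C
```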


A type of sex-determination system in which sex is determined by the number of sets of chromosomes an individual possesses: offspring which develop from fertilized eggs are females and diploid, while offspring which develop from unfertilized eggs are males and haploid, with half as many chromosomes as the females. Haplodiploidy is common to all members of the insect order Hymenoptera and several other insect taxa.
haploid

Denoted in shorthand with the somatic number n.

(of a cell or organism) Having one copy of each chromosome, with each copy not being part of a pair. Contrast diploid and polyploid.
hemizygous
In a diploid organism, having just one allele at a given genetic locus (where there would ordinarily be two). Hemizygosity may be observed when only one copy of a chromosome is present in a normally diploid cell or organism, or when a segment of a chromosome containing one copy of an allele is deleted, or when a gene is located on a sex chromosome in the heterogametic sex (in which the sex chromosomes do not exist in matching pairs); for example, in human males with normal chromosomes, almost all X-linked genes are said to be hemizygous because there is only one X chromosome and few of the same genes exist on the Y chromosome.
heredity

Also called inheritance.

The passing on of phenotypic traits from parents to their offspring, either through sexual or asexual reproduction. Offspring cells or organisms are said to inherit the genetic information of their parents.
heritability
1.  The ability to be inherited.
2.  A statistic used in quantitative genetics that estimates the proportion of variation within a given phenotypic trait that is due to genetic variation between individuals in a particular population. Heritability is estimated by comparing the individual phenotypes of closely related individuals in the population.
See allosome.
heterogeneous expression
heterosis

Also called hybrid vigor and outbreeding enhancement.
heterozygous

In a diploid organism, having two different alleles at a given genetic locus. In genetics shorthand, heterozygous genotypes are represented by a pair of non-matching letters or symbols, often an uppercase letter (indicating a dominant allele) and a lowercase letter (indicating a recessive allele), such as "Aa" or "Bb". Contrast homozygous.
histone
Any of a class of highly alkaline proteins responsible for packaging nuclear DNA into structural units called nucleosomes in eukaryotic cells. Histones are the chief protein components of chromatin, where they associate into complexes which act as "spools" around which the linear DNA molecule winds. They play a major role in gene regulation and expression.
homologous chromosomes

Also called homologs.

A set of two matching chromosomes, one maternal and one paternal, which pair up with each other inside the nucleus during meiosis. They have the same genes at the same loci, but may have different alleles.
homologous recombination
A type of genetic recombination in which nucleotide sequences are exchanged between two similar or identical ("homologous") molecules of DNA, especially that which occurs between homologous chromosomes. The term may refer to the recombination that occurs as a part of any of a number of distinct cellular processes, most commonly DNA repair or chromosomal crossover during meiosis in eukaryotes and horizontal gene transfer in prokaryotes. Contrast nonhomologous recombination.
homozygous
In a diploid organism, having two identical alleles at a given genetic locus. In genetics shorthand, homozygous genotypes are represented by a pair of matching letters or symbols, such as "AA" or "aa". Contrast heterozygous.
horizontal gene transfer (HGT)
housekeeping gene
Any constitutive gene that is transcribed at a relatively constant level across many or all known conditions. Such a gene's products typically serve functions critical to the maintenance of the cell. It is generally assumed that their expression is unaffected by experimental conditions.
Human Genome Project (HGP)
hybrid
The offspring that results from combining the qualities of two organisms of different genera, species, breeds, or varieties through sexual reproduction. Hybrids may occur naturally or artificially, as during selective breeding of domesticated animals and plants. Though reproductive barriers typically prevent hybridization between distantly related organisms, or at least ensure that hybrid offspring are sterile, fertile hybrids may result in speciation.
hybridization
1.  The process by which a hybrid organism is produced from two organisms of different genera, species, breeds, or varieties.
2.  The process by which a single-stranded DNA or RNA preparation is added to an array surface, in solution, and potentially anneals to the complementary probe. Note that with respect to a gene expression assay, hybridization refers to a step in the experimental paradigm, while in molecular biology or genetics, the term refers to the chemical process.
hybridization probe


See allosome.
in situ hybridization
inbreeding
Sexual reproduction between breeds or individuals that are closely related genetically. Inbreeding results in homozygosity, which can increase both the probability of offspring being affected by deleterious recessive traits and the probability of fixing beneficial traits within the breeding population. Contrast outbreeding.
incomplete dominance
indel
A term referring to either an insertion or a deletion of one or more bases in a nucleic acid sequence.
inducer
A protein that binds to a repressor (to disable it) or to an activator (to enable it).
inducible gene
A gene whose expression is either responsive to environmental change or dependent on its host cell's position within the cell cycle.
inheritance
See heredity.
insertion
A type of mutation in which one or more bases are added to a nucleic acid sequence.
insulator
A specific DNA sequence that prevents a gene from being influenced by the activation or repression of nearby genes.
intron
Any nucleotide sequence within a gene that is removed by RNA splicing during post-transcriptional modification of the mRNA primary transcript and therefore absent from the final mature mRNA. The term refers to both the sequence as it exists within a DNA molecule and to the corresponding sequence in RNA transcripts. Contrast exon.
isochromosome
A type of abnormal chromosome in which the arms of the chromosome are mirror images of each other. Isochromosome formation is equivalent to simultaneous duplication and deletion events such that two copies of either the long arm or the short arm comprise the resulting chromosome.


junctional diversity
junk DNA


The karyotype of a human male as visualized in a karyogram using Giemsa staining
karyotype
The number and appearance of chromosomes within the nucleus of a eukaryotic cell, especially as depicted in an organized photomicrograph known as a karyogram or idiogram (in pairs and ordered by size and by position of the centromere). The term is also used to refer to the complete set of chromosomes in a species or individual organism or to any test that detects this complement or measures the chromosome number.
knockdown
A genetic engineering technique by which the normal rate of expression of one or more of an organism's genes is reduced, either through direct modification of a DNA sequence or through treatment with a reagent such as a short DNA or RNA oligonucleotide with a sequence complementary to either an mRNA transcript or a gene.
knockout
A genetic engineering technique in which an organism is modified to carry genes that have been made inoperative ("knocked out"), such that their expression is disrupted at some point in the pathway that produces their gene products and the organism is deprived of their normal effects. Contrast knockin.


lagging strand
On the lagging strand template, a primase "reads" the template DNA and initiates synthesis of a short complementary RNA primer. A DNA polymerase extends the primed segments, forming Okazaki fragments. The RNA primers are then removed and replaced with DNA, and the fragments of DNA are joined together by DNA ligase.
Law of Dominance
Law of Independent Assortment
Law of Segregation
leading strand
linkage
The tendency of DNA sequences which are physically near to each other on the same chromosome to be inherited together during meiosis. Because the physical distance between them is relatively small, the chance that any two nearby parts of a DNA sequence (often loci or genetic markers) will be separated on to different chromatids during chromosomal crossover is statistically very low; such loci are then said to be more linked than loci that are farther apart. Loci that exist on entirely different chromosomes are said to be perfectly unlinked. The standard unit for measuring genetic linkage is the centimorgan (cM).
linkage disequilibrium
locus

Plural loci.

A specific, fixed position on a chromosome where a particular gene or genetic marker resides.
LOD score
long arm

Denoted in shorthand with the symbol q.

In condensed chromosomes where the positioning of the centromere creates two segments of unequal length, the longer of the two segments or "arms" of a chromatid. Contrast short arm.
lyonization
See X-inactivation.


map unit (m.u.)
See centimorgan.
medical genetics
meiosis
A specialized type of cell division that occurs exclusively in sexually reproducing eukaryotes, during which DNA replication is followed by two consecutive rounds of division to ultimately produce four genetically unique haploid daughter cells, each with half the number of chromosomes as the original diploid parent cell. Meiosis only occurs in cells of the sex organs, and serves the purpose of generating haploid gametes such as sperm, eggs, or spores, which are later fused during fertilization. The two meiotic divisions, known as Meiosis I and Meiosis II, also include various genetic recombination events between homologous chromosomes.
Mendelian inheritance
A theory of biological inheritance based on a set of principles originally proposed by Gregor Mendel in 1865 and 1866. Mendel derived three generalized laws about the genetic basis of inheritance which, together with several theories developed by later scientists, are considered the foundation of classical genetics.
messenger RNA (mRNA)
metagenomics

Also called environmental genomics, ecogenomics, and community genomics.

The study of genetic material recovered directly from environmental samples, as opposed to organisms cultivated in laboratory cultures.
metaphase
The stage of mitosis that occurs after prometaphase and before anaphase, during which the centromeres of the replicated chromosomes align along the equator of the cell, with each kinetochore attached to the mitotic spindle.
MicroArray and Gene Expression (MAGE)
A group that "aims to provide a standard for the representation of DNA microarray gene expression data that would facilitate the exchange of microarray information between different data systems".[6]
microchromosome
A type of very small chromosome, generally less than 20,000 base pairs in size, present in the karyotypes of some organisms.
microRNA (miRNA)
microsatellite

Also called a short tandem repeat (STR) and simple sequence repeat (SSR).

Minimum information about a microarray experiment (MIAME)
A commercial standard developed by FGED and based on MAGE in order to facilitate the storage and sharing of gene expression data.[7][8]
Minimal information about a high-throughput sequencing experiment (MINSEQE)
A commercial standard developed by FGED for the storage and sharing of high-throughput sequencing data.[9]
missense mutation
A type of point mutation which results in a codon that codes for a different amino acid than in the unmutated sequence.
mitochondrial DNA (mtDNA)
mitosis
In eukaryotic cells, the part of the cell cycle during which the division of the nucleus takes place and replicated chromosomes are separated into two distinct nuclei. Mitosis is generally preceded by the S stage of interphase, when the cell's DNA is replicated, and either occurs simultaneously with or is followed by cytokinesis, when the cytoplasm and cell membrane are divided into two new daughter cells.
mobile genetic element (MGE)
Any genetic material that can move between different parts of a genome or be transferred from one species or replicon to another within a single generation. The many types of MGEs include transposable elements (transposons), bacterial plasmids, bacteriophage elements which integrate into host genomes by viral transduction, and self-splicing introns.
molecular genetics
A branch of genetics that employs methods of molecular biology to study the structure and function of genes and gene products at the molecular level.
monosomy
The abnormal and frequently pathological presence of only one chromosome of a normal diploid pair. It is a type of aneuploidy.
mosaicism
The presence of two or more populations of cells with different genotypes in an individual organism which has developed from a single fertilized egg. A mosaic organism can result from many kinds of genetic phenomena, including nondisjunction of chromosomes, endoreplication, or mutations in individual stem cell lineages during the early development of the embryo. Mosaicism is similar to but distinct from chimerism.
multiple cloning site (MCS)

Also called a polylinker.
mutagen

Any physical or chemical agent that changes the genetic material, usually DNA, of an organism and thereby increases the frequency of mutations above natural background levels.
mutagenesis
1.  The process by which the genetic information of an organism is changed, resulting in a mutation. Mutagenesis may occur spontaneously or as a result of exposure to a mutagen.
2.  In molecular biology, any laboratory technique by which one or more genetic mutations are deliberately engineered in order to produce a mutant gene, regulatory element, gene product, or genetically modified organism so that the functions of a genetic locus, process, or product can be studied in detail.
mutation
Any permanent change in the nucleotide sequence of a strand of DNA or RNA. Mutations play a role in both normal and abnormal biological processes, including evolution. They can result from replication errors, molecular damage, or manipulations by mobile genetic elements. Repair mechanisms have evolved in many organisms to correct them.


neutral mutation
1.  Any mutation of a nucleic acid sequence that is neither beneficial nor detrimental to the ability of an organism to survive and reproduce.
2.  Any mutation in which natural selection does not affect the spread of the mutation within a population.
nitrogenous base

Sometimes used interchangeably with nucleobase or simply base.

Any organic compound containing a nitrogen atom that has the chemical properties of a base. A set of five distinct nitrogenous bases – adenine (A), guanine (G), cytosine (C), thymine (T), and uracil (U) – are especially relevant to biology because they are used in the construction of nucleotides, which in turn are the primary monomers that make up nucleic acids.
non-coding DNA
Any segment of DNA that does not encode a sequence that may ultimately be transcribed and translated into a protein. In most organisms, only a small fraction of the genome consists of protein-coding DNA, though the proportion varies greatly between species. Some non-coding DNA may still be transcribed into functional non-coding RNA (as with transfer RNAs) or may serve important developmental or regulatory purposes; other regions (as with so-called "junk DNA") appear to have no known biological function.
non-coding RNA
non-homologous end joining (NHEJ)
nondisjunction
The failure of homologous chromosomes or sister chromatids to separate properly during cell division. Nondisjunction results in daughter cells that are aneuploid, containing abnormal numbers of one or more specific chromosomes. It may be caused by any of a variety of factors.
nonhomologous recombination
nonsense mutation

Also called a point-nonsense mutation.

A type of point mutation which results in a premature stop codon in the transcribed mRNA sequence, thereby causing the premature termination of translation and producing a truncated, incomplete, and often non-functional protein.
Northern blotting
nuclear membrane

Also called the nuclear envelope.

A sub-cellular barrier consisting of two lipid bilayer membranes that surrounds the nucleus in eukaryotic cells.
nucleic acid
A long, polymeric macromolecule made up of smaller monomers called nucleotides which are chemically linked to one another in a chain. Two specific types of nucleic acid, DNA and RNA, are used in biological systems to encode the genetic information governing the construction, development, and ordinary processes of all living organisms. The order, or sequence, of the nucleotides in DNA and RNA molecules contains information that is translated into proteins, which direct all of the chemical reactions necessary for life.
nucleic acid sequence
The precise order of consecutively linked nucleotides in a nucleic acid molecule, such as DNA or RNA. Long sequences of nucleotides are the principal means by which biological systems store genetic information, and therefore the accurate replication, transcription, and translation of such sequences is of the utmost importance, lest the information be lost or corrupted. Nucleic acid sequences may be equivalently referred to as sequences of nitrogenous bases, nucleobases, nucleotides, or base pairs, and they correspond directly to sequences of codons and amino acids.
nucleobase

Sometimes used interchangeably with nitrogenous base or simply base.

One of the five primary or canonical nitrogenous bases – adenine (A), guanine (G), cytosine (C), thymine (T), and uracil (U) – that form nucleosides and nucleotides, the latter of which are the fundamental building blocks of nucleic acids. The ability of these nucleobases to form base pairs via hydrogen bonding, as well as their flat, compact three-dimensional profiles, allows them to "stack" one upon another and leads directly to the long-chain structures of DNA and RNA.
nucleolus
An organelle within the nucleus of eukaryotic cells which is composed of proteins, DNA, and RNA and serves as the site of ribosome synthesis.
nucleoside
An organic molecule composed exclusively of a nitrogenous base bound to a five-carbon sugar (either ribose or deoxyribose), as opposed to a nucleotide, which additionally includes one or more phosphate groups.
nucleotide
An organic molecule that serves as the monomer or subunit of nucleic acid polymers, including RNA and DNA. Each nucleotide is composed of three constituent parts: a nitrogenous base, a five-carbon sugar (either ribose or deoxyribose), and at least one phosphate group. Though technically distinct, the term "nucleotide" is often used interchangeably with nitrogenous base, nucleobase, and base pair when referring to the sequences that make up nucleic acids. Contrast nucleoside.
nucleus

Plural nuclei.

A membrane-enclosed organelle found in eukaryotic cells which contains most of the cell's genetic material (organized as chromosomes) and directs the activities of the cell by regulating gene expression.
null allele
Any allele made non-functional by way of a genetic mutation. The mutation may result in the complete failure to produce a gene product or a gene product that does not function properly; in either case, the allele may be considered non-functional.


Okazaki fragments
oligonucleotide

Also abbreviated oligo.

A short chain of nucleic acid residues. Oligonucleotides are often used to detect the presence of larger mRNA molecules or assembled into two-dimensional microarrays for high-throughput sequence analysis.
oncogene
A gene that has the potential to cause cancer. In tumor cells, such genes are often mutated and/or expressed at abnormally high levels.
open reading frame (ORF)
The part of a reading frame that has the ability to be translated from DNA or RNA into protein; any continuous stretch of codons that contains a start codon and a stop codon.
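The definition above can be sketched as a simple scan for a start codon followed by an in-frame stop codon. This is a minimal illustration, not a gene finder: it assumes a DNA sequence given 5' to 3', the standard start codon ATG and stop codons TAA, TAG, and TGA, and it scans only the given strand (not its reverse complement). The name `find_orfs` is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, start_codon="ATG"):
    """Return (start, end, orf_sequence) tuples for each of the three
    reading frames on the given strand; end is exclusive and the
    returned sequence includes the stop codon."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == start_codon:
                start = i  # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                orfs.append((start, i + 3, seq[start:i + 3]))
                start = None  # look for the next ORF in this frame
    return orfs

if __name__ == "__main__":
    print(find_orfs("CCATGAAATAGCC"))
```

A start codon with no downstream in-frame stop codon is not reported, since the definition requires both.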
origin of replication
outbreeding

Also called outcrossing and crossbreeding.

Sexual reproduction between different breeds or individuals, which has the potential to increase genetic diversity by introducing unrelated genetic material into a breeding population. Contrast inbreeding.
overexpression
An abnormally high level of gene expression which results in an excessive number of copies of one or more gene products. Overexpression produces a pronounced gene-related phenotype.[10][11]


palindromic sequence
A nucleic acid sequence of a double-stranded DNA or RNA molecule in which the unidirectional sequence (e.g. 5' to 3') of nucleotides on one strand matches the sequence in the same direction (e.g. 5' to 3') on the complementary strand. In other words, a nucleotide sequence is said to be palindromic if it is equal to its reverse complement. Palindromic motifs are common in most genomes and are capable of forming hairpins.
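The reverse-complement test described above can be sketched in a few lines. The function names are hypothetical, and the sketch assumes an unambiguous DNA sequence containing only A, C, G, and T.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (read 5' to 3')."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def is_palindromic(seq):
    """True if the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)

if __name__ == "__main__":
    print(is_palindromic("GAATTC"))   # the EcoRI recognition site
    print(is_palindromic("GATTACA"))
```

Note that this is different from an ordinary text palindrome: GAATTC does not read the same backwards, but it does match the 5'-to-3' sequence of its complementary strand.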
particulate inheritance
pedigree chart
penetrance
The proportion of individuals with a given genotype who express the associated phenotype, usually given as a percentage. Because of the many complex interactions that govern gene expression, the same allele may produce an observable phenotype in one individual but not in another. If less than 100% of the individuals in a population carrying the genotype of interest also express the associated phenotype, both the genotype and phenotype may be said to show incomplete penetrance. Penetrance quantifies the probability that an allele will result in the expression of its associated phenotype in any form, i.e. to any extent that makes an individual carrier different from individuals without the allele. Compare expressivity.
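As a worked example of the proportion described above, the following sketch computes penetrance as a percentage from two counts. The function name and the figures used are hypothetical and chosen only for illustration.

```python
def penetrance(n_carriers, n_expressing):
    """Percentage of genotype carriers who express the associated phenotype."""
    if n_carriers <= 0:
        raise ValueError("n_carriers must be positive")
    if not 0 <= n_expressing <= n_carriers:
        raise ValueError("n_expressing must be between 0 and n_carriers")
    return 100.0 * n_expressing / n_carriers

if __name__ == "__main__":
    # Hypothetical cohort: 200 carriers, 150 of whom show the phenotype.
    print(penetrance(200, 150))
```

Any result below 100 corresponds to incomplete penetrance in the sense used above.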
peptide
A short chain of amino acid monomers linked by covalent peptide bonds.
phenotype
The composite of the observable morphological, physiological, and behavioral traits of an organism that result from the expression of the organism's genotype as well as the influence of environmental factors and the interactions between the two.
phosphate backbone
phosphodiester bond
The pair of ester bonds linking a phosphate with the two pentose rings of consecutive nucleotides on the same strand of a nucleic acid. Each phosphate molecule forms a covalent bond with the 3' carbon of one pentose and the 5' carbon of the adjacent pentose; the repeated series of such bonds that holds together the long chain of nucleotides in DNA and RNA is known as the phosphate or phosphodiester backbone.
phylogenetics
The study of the evolutionary history of and relationships between individuals or groups of organisms, such as species or populations, through methods that evaluate observed heritable traits, including morphological features and DNA sequences. The result of such analyses is known as a phylogeny or phylogenetic tree.
plasmid
Any small DNA molecule that is physically separated from the larger body of chromosomal DNA and can replicate independently. Plasmids are most commonly found as small, circular, double-stranded DNA molecules in prokaryotes such as bacteria, though they are also sometimes present in archaea and eukaryotes.
pleiotropy
The phenomenon by which one gene influences two or more seemingly unrelated phenotypic traits, by any of several distinct but potentially overlapping mechanisms.
ploidy
The number of complete sets of chromosomes in a cell, and hence the number of possible alleles present within the cell at any given autosomal locus.
point mutation

Also called a substitution.

A type of mutation by which a single nucleotide base is changed, inserted, or deleted from a sequence of DNA or RNA.
poly(A) tail
polygenic trait
polylinker
See multiple cloning site.
polymerase chain reaction (PCR)
polypeptide
A long, continuous, and unbranched polymeric chain of amino acid monomers linked by covalent peptide bonds, typically longer than a peptide. Proteins generally consist of one or more polypeptides arranged in a biologically functional way.
polyploid
(of a cell or organism) Having more than two homologous copies of each chromosome. Polyploidy may occur as a normal condition of chromosomes in certain cells or even entire organisms, or it may occur as the result of abnormal cell division or a mutation causing the duplication of the entire chromosome set. Contrast haploid and diploid.
polysome

Also called a polyribosome or ergosome.

A complex of a messenger RNA molecule and two or more ribosomes which act to translate the mRNA transcript into a polypeptide.
population genetics
A subfield of genetics and evolutionary biology that studies genetic differences within and between populations of organisms.
positional cloning
post-transcriptional modification
post-translational modification
primary transcript
The unprocessed, single-stranded RNA molecule produced by the transcription of a DNA sequence as it exists before post-transcriptional modifications such as alternative splicing convert it into a mature RNA product such as an mRNA, tRNA, or rRNA. A precursor mRNA or pre-mRNA, for example, is a type of primary transcript that becomes a mature mRNA ready for translation after processing.
proband

Also propositus for a male subject and proposita for a female subject.

A term used in medical genetics and genealogy to denote a particular subject being studied or reported on.
probe
A reagent used to make a single measurement in a gene expression experiment. Compare reporter.
probe set
A collection of two or more probes designed to measure a single molecular species, such as a collection of oligonucleotides designed to hybridize to various parts of the mRNA transcripts generated from a single gene.
promoter
A region of DNA that initiates the transcription of a particular gene.
prophase
The stage of mitosis occurring after interphase and before prometaphase, during which the DNA of the chromosomes is condensed into chromatin, the nucleolus breaks down, and the mitotic spindle forms.
protein
A linear polymeric macromolecule composed of a series of amino acids linked by peptide bonds. Proteins carry out the majority of the chemical reactions that occur inside living cells.
Punnett square
purebred

Also called a purebreed.
purine

A double-ringed heterocyclic organic compound which, along with pyrimidine, is one of two molecules from which all nitrogenous bases (including those used in DNA and RNA) are derived. Adenine (A) and guanine (G) are classified as purines.
putative gene
A specific nucleotide sequence suspected to be a functional gene based on the identification of its open reading frame. The gene is said to be "putative" in the sense that no function has yet been described for its products.
pyrimidine
A single-ringed heterocyclic organic compound which, along with purine, is one of two molecules from which all nitrogenous bases (including those used in DNA and RNA) are derived. Cytosine (C), thymine (T), and uracil (U) are classified as pyrimidines.
pyrimidine dimer


quantitative genetics
A branch of population genetics which studies phenotypes that vary continuously (such as height or mass) as opposed to those that fall into discretely identifiable categories (such as eye color or the presence or absence of a particular trait). Quantitative genetics employs statistical methods and concepts to link continuously distributed phenotypic values to specific genotypes and gene products.
quantitative PCR (qPCR)

Also called real-time PCR (rtPCR).

quantitative trait

Also called a complex trait.

quantitative trait locus (QTL)


reading frame
A way of dividing the nucleotide sequence in a DNA or RNA molecule into a set of consecutive, non-overlapping triplets, which is "read" by proteins during transcription and replication. In coding DNA, each triplet is referred to as a codon that corresponds to a particular amino acid during translation. In general, only one reading frame (the so-called open reading frame) in a given section of a nucleic acid can be used to make functional proteins, but there are exceptions in a few organisms. A frameshift mutation results in a shift in the normal reading frame and affects all downstream codons.
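To make the idea of dividing a sequence into consecutive, non-overlapping triplets concrete, the sketch below splits one strand into codons for each of its three possible reading frames. The function name `codons_in_frame` is hypothetical; frames are numbered 0 to 2 by the offset of the first triplet, and any incomplete trailing triplet is dropped.

```python
def codons_in_frame(seq, frame):
    """Split seq into non-overlapping triplets starting at offset frame (0, 1, or 2)."""
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1, or 2")
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]

if __name__ == "__main__":
    seq = "ATGGCCTAA"
    for frame in range(3):
        print(frame, codons_in_frame(seq, frame))
```

Shifting the starting offset by one or two bases changes every downstream triplet, which is why a frameshift mutation alters all codons that follow it.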
A relationship between the alleles of a gene in which one allele produces an effect on phenotype that is overpowered or "masked" by the contribution of another allele at the same locus; the first allele and its associated phenotypic trait are said to be recessive, and the second allele and its associated trait are said to be dominant. Often, recessive alleles code for inefficient or dysfunctional proteins. Like dominance, recessiveness is not an inherent property of any allele or phenotype, but simply describes its relationship to one or more other alleles or phenotypes. In genetics shorthand, recessive alleles are often represented by a lowercase letter (e.g. "a", in contrast to the dominant "A").
recombinant DNA (rDNA)
replication
1.  The process by which certain biological molecules, notably the nucleic acids DNA and RNA, produce copies of themselves.
2.  A technique used to estimate technical and biological variation in experiments for statistical analysis of microarray data. Replicates may be technical replicates, such as dye swaps or repeated array hybridizations, or biological replicates, biological samples from separate experiments that test the effects of the same experimental treatments.
replicon
Any molecule or region of DNA or RNA that replicates from a single origin of replication.
reporter
A MIAME-compliant term to describe a reagent used to make a single measurement in a gene expression experiment. MIAME defines it as "the nucleotide sequence present in a particular location on the array".[7] A reporter may be a segment of single-stranded DNA that is covalently attached to the array surface. Compare probe.
repressor
A DNA-binding protein that decreases the expression of one or more genes by binding to the operator and blocking the attachment of RNA polymerase to the promoter, thus preventing transcription.
response element
A short sequence of DNA within a promoter region that is able to bind specific transcription factors in order to regulate transcription of specific genes.
restriction enzyme
restriction fragment
Any DNA fragment that results from the cutting of a DNA strand by a restriction enzyme at one or more restriction sites.
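The following Python sketch illustrates how restriction fragments arise from cutting at every restriction site. It is a simplification under stated assumptions: it models only one strand of a linear molecule, the function name `digest` and the input sequence are hypothetical, and the enzyme shown is EcoRI, which recognizes GAATTC and cuts after the first G.

```python
def digest(dna, site, cut_offset):
    """Cut `dna` at every occurrence of `site` and return the fragments.

    `cut_offset` is the position within the recognition site at which the
    strand is cut (1 for EcoRI, which cuts G^AATTC).
    """
    fragments = []
    start = 0
    position = dna.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(dna[start:cut])
        start = cut
        position = dna.find(site, position + 1)
    fragments.append(dna[start:])  # remainder after the last cut
    return fragments


# Made-up sequence containing two EcoRI sites (assumption for illustration).
fragments = digest("TTGAATTCAAGGGAATTCTT", "GAATTC", 1)
```

Two sites in a linear molecule give three fragments, and joining the fragments in order reproduces the original sequence.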
restriction fragment length polymorphism (RFLP)
restriction site

Also called a restriction recognition site.

reverse transcriptase
ribonucleic acid (RNA)
A polymeric nucleic acid molecule composed of a series of ribonucleotides which incorporate a set of four nucleobases: adenine (A), guanine (G), cytosine (C), and uracil (U). Unlike DNA, RNA is more often found as a single strand folded onto itself, rather than a paired double strand. Various types of RNA molecules serve in a wide variety of essential biological roles, including coding, decoding, regulating, and expressing genes, as well as functioning as signaling molecules and, in certain viral genomes, as the primary genetic material itself.
ribosomal RNA (rRNA)
ribosome
A molecular complex that serves as the site of protein synthesis. Ribosomes consist of two subunits (the small subunit, which reads the messages encoded in mRNA molecules, and the large subunit, which links amino acids in sequence to form a polypeptide chain), each of which is composed of one or more strands of ribosomal RNA and various ribosomal proteins.
RNA interference
RNA polymerase
RNA splicing


Sanger sequencing
selective sweep
sequence motif
sequence-tagged site (STS)
sex chromosome
See allosome.
sex linkage
short arm

Denoted in shorthand with the symbol p.

In condensed chromosomes where the positioning of the centromere creates two segments of unequal length, the shorter of the two segments or "arms" of a chromatid. Contrast long arm.
short tandem repeat
See microsatellite.
shotgun sequencing
silencer
A region of DNA that can be bound by a repressor.
silent mutation
A type of neutral mutation which does not have an observable effect on the organism's phenotype. Though the term "silent mutation" is often used interchangeably with synonymous mutation, synonymous mutations are not always silent, nor vice versa. Missense mutations which result in a different amino acid but one with similar functionality (e.g. leucine instead of isoleucine) are also often classified as silent, since such mutations usually do not significantly affect protein function.
single nucleotide polymorphism (SNP)
single-stranded DNA (ssDNA)
sister chromatids
A pair of identical copies (chromatids) produced as the result of the DNA replication of a chromosome, particularly when both copies are joined together by a common centromere; the pair of sister chromatids is called a dyad. The two sister chromatids are ultimately separated from each other into two different cells during mitosis or meiosis.
small interfering RNA (siRNA)
solenoid fiber
somatic cell

Also called a vegetal cell or soma.

Any biological cell forming the body of an organism, or, in multicellular organisms, any cell other than a gamete, germ cell, or undifferentiated stem cell. Somatic cells are theoretically distinct from cells of the germ line, meaning the mutations they have undergone can never be transmitted to the organism's descendants, though in practice exceptions do exist.
somatic cell nuclear transfer (SCNT)
Southern blot
A molecular biology method used for detecting a specific sequence in DNA samples. The technique combines separation of DNA fragments by gel electrophoresis, transfer of the DNA to a synthetic membrane, and subsequent identification of target fragments with radio-labeled or fluorescent hybridization probes.
spatially-restricted gene expression
The expression of genes only within a specific anatomical region or tissue, often in response to a paracrine signal. The boundary between two spatially-restricted genes can set up a sharp gradient, often expressed phenotypically as striping patterns.
spectral karyotype (SKY)
See genetic engineering.
standard genetic code
The genetic code used by the vast majority of living organisms for translating nucleic acid sequences into proteins. In this system, of the 64 possible permutations of three-letter codons that can be made from the four nucleotides, 61 code for one of the 20 amino acids, and the remaining three code for stop signals. For example, the codon CAG codes for the amino acid glutamine and the codon TAA is a stop codon. The standard genetic code is described as degenerate or redundant because a single amino acid may be coded for by more than one codon.
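The codon counts given above can be checked with a short Python sketch. It builds the standard code from a compact 64-letter string of one-letter amino acid symbols listed in T, C, A, G base order, with `*` marking stop signals; the variable names are illustrative only.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# One-letter amino acid symbols for the 64 codons in TCAG order; '*' = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
CODON_TABLE = dict(zip(codons, AMINO_ACIDS))

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
sense_codons = [c for c, aa in CODON_TABLE.items() if aa != "*"]

# Degeneracy: how many codons specify each amino acid.
codons_per_amino_acid = Counter(CODON_TABLE[c] for c in sense_codons)
```

The table contains 64 codons, of which 61 are assigned to 20 amino acids and three are stops. Leucine (L), for instance, is specified by six different codons, while methionine (M) and tryptophan (W) have one each.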
start codon
The first codon translated by a ribosome from a mature messenger RNA transcript, used as a signal to initiate protein synthesis. In the standard genetic code, the start codon always codes for the same amino acid, methionine, in eukaryotes and for a modified methionine in prokaryotes. The most common start codon is the triplet AUG. Contrast stop codon.
statistical genetics
A branch of genetics concerned with the development of statistical methods for drawing inferences from genetic data. The theories and methodologies of statistical genetics often support research in quantitative genetics, genetic epidemiology, and bioinformatics.
stem-loop

Also called a hairpin or hairpin loop.

stem cell
Any biological cell which has not yet differentiated into a specialized cell type and which can divide through mitosis to produce more stem cells.
sticky end
stop codon

Also called a termination codon.

A codon that signals the termination of protein synthesis during translation of a messenger RNA transcript. In the standard genetic code, three different stop codons are used to dissociate ribosomes from the growing amino acid chain, thereby ending translation: UAG (nicknamed "amber"), UAA ("ochre"), and UGA ("opal"). Contrast start codon.
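A minimal Python sketch of how start and stop codons delimit translation is given below. It assumes, as a simplification, that translation begins at the first AUG found in the transcript, which ignores the sequence context that influences initiation in living cells. The function name `translate` and the example transcript are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code as one-letter symbols in UCAG order; '*' = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
STOP_NAMES = {"UAG": "amber", "UAA": "ochre", "UGA": "opal"}


def translate(mrna):
    """Translate from the first AUG until a stop codon.

    Returns (peptide, stop_codon); stop_codon is None if no in-frame stop
    is reached, and the peptide is empty if no AUG is present.
    """
    start = mrna.find("AUG")
    if start == -1:
        return "", None
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_NAMES:
            return "".join(peptide), codon
        peptide.append(CODON_TABLE[codon])
    return "".join(peptide), None


peptide, stop = translate("GGAUGGCCCAGUAAGG")  # made-up transcript
```

For this transcript the codons read are AUG, GCC, CAG, and UAA, so the peptide is Met-Ala-Gln (MAQ) and translation ends at the "ochre" stop codon.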
structural gene
A gene that codes for any protein or RNA product other than a regulatory factor. Structural gene products include enzymes, structural proteins, and certain non-coding RNAs.
substitution
1.  Another name for a point mutation.
2.  A type of point mutation in which a single nucleotide base is changed or substituted for another.
swivel point
synonymous mutation

Also called a synonymous substitution.


tandem repeat
A pattern within a nucleic acid sequence in which one or more nucleobases are repeated and the repetitions are directly adjacent (i.e. tandem) to each other. An example is ATGAC ATGAC ATGAC, in which the sequence ATGAC is repeated three times.
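One simple way to locate such patterns, sketched here in Python, is a regular expression with a backreference, which matches a unit followed immediately by one or more copies of itself. The function name `find_tandem_repeats` and its default thresholds are assumptions for illustration; dedicated repeat-finding tools use more elaborate methods that tolerate mismatches.

```python
import re


def find_tandem_repeats(sequence, min_unit=2, min_copies=2):
    """Return (start, unit, copies) for runs of directly adjacent repeats.

    The unit is matched lazily, so the shortest repeating unit at each
    position is reported. Only exact repeats are detected.
    """
    pattern = re.compile(r"([ACGT]{%d,}?)\1{%d,}" % (min_unit, min_copies - 1))
    results = []
    for match in pattern.finditer(sequence):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        results.append((match.start(), unit, copies))
    return results


# The example from the definition above, embedded in made-up flanking bases.
hits = find_tandem_repeats("GGATGACATGACATGACTT")
```

The search reports the unit ATGAC starting at index 2 and repeated three times.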
TATA box

Also called the Goldberg-Hogness box.

A highly conserved non-coding DNA sequence containing a consensus of repeating T and A base pairs that is commonly found in promoter regions of genes in archaea and eukaryotes. The TATA box often serves as the site of initiation of transcription or as a binding site for transcription factors.
template strand
thymine

Abbreviated in shorthand with the letter T.

One of the four nucleobases present in DNA molecules. Thymine forms a base pair with adenine. In RNA, thymine is not used at all, and is instead replaced with uracil.
tissue-specific gene expression
Gene function and expression which is restricted to a particular tissue or cell type. Tissue-specific expression is usually the result of an enhancer which is activated only in the proper cell type.
transcription
The first step in the process of gene expression, in which a messenger RNA molecule complementary to a particular gene encoded in DNA is synthesized by enzymes called RNA polymerases. Transcription must be followed by translation before a functional protein can be produced.
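The base-pairing logic of transcription can be sketched in Python as follows. The sketch assumes the template strand is written 3' to 5' from left to right, so that the resulting mRNA reads 5' to 3'; the function names and the example sequences are hypothetical. It models only complementarity and omits promoters, RNA polymerase, and RNA processing.

```python
# DNA template base -> complementary RNA base (adenine pairs with uracil).
TEMPLATE_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_strand):
    """Return the mRNA complementary to a template strand written 3'->5'."""
    return "".join(TEMPLATE_TO_RNA[base] for base in template_strand)


def coding_strand_to_mrna(coding_strand):
    """The mRNA matches the coding strand, with uracil in place of thymine."""
    return coding_strand.replace("T", "U")


mrna = transcribe("TACCGGGTCATT")  # made-up template strand
```

Both routes give the same transcript: the template TACCGGGTCATT and the corresponding coding strand ATGGCCCAGTAA each yield AUGGCCCAGUAA.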
transcription factor (TF)
Any protein that controls the rate of transcription of genetic information from DNA to messenger RNA by binding to a specific DNA sequence and promoting or blocking the recruitment of RNA polymerase to nearby genes. Transcription factors can effectively turn "on" and "off" specific genes in order to make sure they are expressed at the right times and in the right places; for this reason, they are a fundamental and ubiquitous mechanism of gene regulation.
transcriptional bursting
The intermittent nature of transcription and translation mechanisms. Both processes occur in "bursts" or "pulses", with periods of gene activity separated by irregular intervals.
transcript of unknown function (TUF)
transfer RNA (tRNA)

Formerly referred to as soluble RNA (sRNA).

A special class of RNA molecule, typically 76 to 90 nucleotides in length, that serves as a physical adapter allowing mRNA transcripts to be translated into sequences of amino acids during protein synthesis. Each tRNA contains a specific anticodon triplet corresponding to an amino acid that is covalently attached to the tRNA's opposite end; as translation proceeds, tRNAs are recruited to the ribosome, where each mRNA codon is paired with a tRNA containing the complementary anticodon. Depending on the organism, cells may employ as many as 41 distinct tRNAs with unique anticodons; because of codon degeneracy within the genetic code, several tRNAs containing different anticodons carry the same amino acid.
transgene
Any gene or other segment of genetic material that has been isolated from one organism and then transferred either naturally or by any of a variety of genetic engineering techniques into another organism, especially one of a different species. Transgenes are usually introduced into the second organism's germ line. They are commonly used to study gene function or to confer an advantage not otherwise available in the unaltered organism.
translation
The second step in the process of gene expression, in which the messenger RNA transcript produced during transcription is read by a ribosome to produce a functional protein.
transposable element (TE)

Also called a transposon.



unequal crossing-over
upregulation

Also called promotion.

Any process, natural or artificial, which increases the rate of gene expression of a certain gene. A gene which is observed to have higher expression (such as by detecting higher levels of its mRNA transcripts) in one sample than in another sample (often a control) is said to be upregulated. Contrast downregulation.
upstream
Towards or closer to the 5'-end of a chain of nucleotides. Contrast downstream.
uracil

Abbreviated in shorthand with the letter U.

One of the four nucleobases present in RNA molecules. Uracil forms a base pair with adenine. In DNA, uracil is not used at all, and is instead replaced with thymine.


Western blotting
wild type (WT)

Denoted in shorthand by a + superscript.

A term referring to the phenotype of the typical form of a species as it occurs in nature, a product of the standard "normal" allele at a given locus as opposed to that produced by a non-standard mutant allele.
wobble base pairing


X chromosome
One of two sex chromosomes present in organisms which use the XY sex-determination system (and the only sex chromosome in the X0 system). The X chromosome is found in both males and females and typically contains much more gene content than its counterpart, the Y chromosome.
X-linked trait


Y chromosome
One of two sex chromosomes present in organisms which use the XY sex-determination system. The Y chromosome is found only in males and is typically much smaller than its counterpart, the X chromosome.
yeast artificial chromosome (YAC)


zygote
A type of eukaryotic cell formed as the direct result of a fertilization event between two gametes. In multicellular organisms, the zygote is the earliest developmental stage.



  1. ^ "Talking Glossary of Genetic Terms". 8 October 2017. Retrieved 8 October 2017.
  2. ^ Nishikawa, S. (2007). "Reprogramming by the numbers". Nature Biotechnology. 25 (8): 877–878. doi:10.1038/nbt0807-877. PMID 17687365.
  3. ^ Priness, I.; Maimon, O.; Ben-Gal, I. (2007). "Evaluation of gene-expression clustering via mutual information distance measure". BMC Bioinformatics. 8: 111. doi:10.1186/1471-2105-8-111. PMC 1858704. PMID 17397530.
  4. ^ "Functional Genomics Data Society – FGED Society".
  5. ^ Yakovchuk P, Protozanova E, Frank-Kamenetskii MD (2006). "Base-stacking and base-pairing contributions into thermal stability of the DNA double helix". Nucleic Acids Res. 34 (2): 564–74. doi:10.1093/nar/gkj454. PMC 1360284. PMID 16449200.
  6. ^ Rayner TF; Rocca-Serra P; Spellman PT; Causton HC; et al. (2006). "A simple spreadsheet-based, MIAME-supportive format for microarray data: MAGE-TAB". BMC Bioinformatics. 7: 489. doi:10.1186/1471-2105-7-489. PMC 1687205. PMID 17087822.
  7. ^ a b Oliver S (2003). "On the MIAME Standards and Central Repositories of Microarray Data". Comp. Funct. Genomics. 4 (1): 1. doi:10.1002/cfg.238. PMC 2447402. PMID 18629115.
  8. ^ Brazma A (2009). "Minimum Information About a Microarray Experiment (MIAME)--successes, failures, challenges". ScientificWorldJournal. 9: 420–3. doi:10.1100/tsw.2009.57. PMC 5823224. PMID 19484163.
  9. ^ Functional Genomics Data Society (June 2012). "Minimum Information about a high-throughput SEQuencing Experiment".
  10. ^ "overexpression". Oxford Living Dictionary. Oxford University Press. 2017. Retrieved 18 May 2017. The production of abnormally large amounts of a substance which is coded for by a particular gene or group of genes; the appearance in the phenotype to an abnormally high degree of a character or effect attributed to a particular gene.
  11. ^ "overexpress". NCI Dictionary of Cancer Terms. National Cancer Institute at the National Institutes of Health. 2 February 2011. Retrieved 18 May 2017. In biology, to make too many copies of a protein or other substance. Overexpression of certain proteins or other substances may play a role in cancer development.
