Major histocompatibility complex
The major histocompatibility complex (MHC) is a large genomic region or gene family found in most vertebrates containing many genes with important immune system roles. In humans, the MHC spans almost 4 megabases (4 000 000 base pairs) of chromosome 6 and includes more than 200 known genes, of which about half have known immunological functions. The MHC is divided into three subgroups called MHC class I, MHC class II and MHC class III. The MHC class I region encodes heterodimeric peptide-binding proteins as well as antigen-processing molecules such as TAP and tapasin. The MHC class II region encodes heterodimeric peptide-binding proteins, such as MHC II DQ and MHC II DP, and proteins that modulate peptide loading onto MHC class II proteins in the lysosomal compartment, such as MHC II DM. The MHC class III region encodes other immune components, such as complement components (e.g. C2, C4, factor B) and cytokines (e.g. TNF-α).
Introduction
The best-known genes in the MHC region are the subset that encodes cell-surface antigen-presenting proteins. In humans, these genes are referred to as human leukocyte antigen (HLA) genes, although people often use the abbreviation MHC to refer to HLA gene products. To disambiguate the usage, some of the biomedical literature uses HLA to refer specifically to the HLA protein molecules and reserves MHC for the region of the genome that encodes these molecules. This convention is not consistently adhered to, however.
The most intensely studied HLA genes are the nine so-called classical MHC genes: HLA-A, HLA-B, HLA-C, HLA-DPA1, HLA-DPB1, HLA-DQA1, HLA-DQB1, HLA-DRA, and HLA-DRB1. In humans, the MHC is divided into three regions: Class I, II, and III. The A, B, and C genes belong to MHC class I while the six D genes belong to class II.
Besides being scrutinized by immunologists for its pivotal role in the immune system, the MHC has also attracted the attention of many evolutionary biologists, due to the high levels of allelic diversity found within many of its genes. Indeed, much theory has been devoted to explaining why this particular region of the genome harbors so much diversity, especially in light of its immunological importance.
Molecular biology of MHC proteins
The classical MHC molecules (also referred to as HLA molecules in humans) have a vital role in the complex immunological dialog that must occur between T cells and other cells of the body. At maturity, MHC molecules are anchored in the cell membrane, where they display short polypeptides to T cells via the T cell receptors (TCRs). The polypeptides may be "self," that is, originating from a protein created by the organism itself, or they may be foreign, originating from bacteria, viruses, pollen, etc. The overarching design of the MHC-TCR interaction is that T cells should ignore self peptides while reacting appropriately to the foreign peptides. Foreign peptides that provoke an immune response are called antigens.
The immune system has another, equally important method to identify antigen: B cells with their membrane-bound antibodies, also known as B cell receptors (BCRs). However, while the BCRs of B cells can bind to antigens without much outside help, the TCRs of T cells require "presentation" of the antigen: this is the job of MHC. The vast majority of the time, MHC molecules are kept busy presenting self-peptides, which the T cells should appropriately ignore. A full-force immune response usually requires the activation of B cells via BCRs and of T cells via the MHC-TCR interaction. This dual requirement creates a system of "checks and balances" and underscores the immune system's potential for running amok and causing harm to the body (see autoimmune disorders).
All MHC molecules receive polypeptides from inside the cells they are part of and display them on the cell's exterior surface for recognition by T cells. However, there are major differences between MHC class I and II in the method and outcome of peptide presentation.
MHC class I
MHC class I molecules are found on almost every nucleated cell of the body. MHC class I molecules are heterodimers, consisting of a single transmembrane polypeptide chain (the α chain) and a β2-microglobulin (which is encoded elsewhere, not in the MHC). The α chain has two polymorphic domains, α1 and α2, which bind peptides derived from cytosolic proteins. Because MHC class I molecules present peptides derived from cytosolic proteins, the pathway of MHC class I presentation is often called the cytosolic or endogenous pathway.
The peptides are mainly generated in the cytosol by the proteasome. The proteasome is a macromolecular complex that consists of 28 subunits, half of which contribute to proteolytic activity. The proteasome degrades intracellular proteins into small peptides that are then released into the cytosol. The peptides have to be translocated from the cytosol into the endoplasmic reticulum (ER) to meet the MHC class I molecule, which has its peptide-binding site in the lumen of the ER.
The peptide translocation from the cytosol into the lumen of the ER is accomplished by the Transporter associated with Antigen Processing (TAP). TAP is a member of the ABC transporter family and is a heterodimeric multimembrane-spanning polypeptide consisting of TAP1 and TAP2. The two subunits form a peptide-binding site and two ATP-binding sites that face the cytosol. TAP binds peptides on the cytoplasmic side and translocates them, at the expense of ATP, into the lumen of the ER. The MHC class I molecule is then in turn loaded with peptides in the lumen of the ER. The peptide-loading process involves several other molecules, which form a large multimeric complex consisting of TAP, tapasin, calreticulin, calnexin, and ER60.
Once the peptide is loaded onto the MHC class I molecule, the complex leaves the ER through the secretory pathway to reach the cell surface. The transport of MHC class I molecules through the secretory pathway involves several posttranslational modifications of the MHC molecule. Some of the posttranslational modifications occur in the ER and involve changes to the N-glycan regions of the protein, followed by extensive changes to the N-glycans in the Golgi apparatus. The N-glycans mature fully before they reach the cell surface.
Peptides that fail to bind MHC class I molecules in the lumen of the endoplasmic reticulum are removed from the ER via the Sec61 channel into the cytosol, where they might undergo further trimming in size and might be translocated by TAP back into the ER for binding to an MHC class I molecule.
MHC class I molecules are loaded with peptides generated in the cytosol. As viruses infect a cell by entering its cytoplasm, this cytosolic, MHC class I-dependent pathway of antigen presentation is the primary way for a virus-infected cell to signal T cells. MHC class I molecules generally interact with CD8+ ("cytotoxic") T cells (CTLs). The fate of the virus-infected cell is almost always apoptosis initiated by the CTL, effectively reducing the risk of infecting neighboring cells.
MHC class II
MHC class II molecules are found only on a few specialized cell types, including macrophages, dendritic cells, activated T cells, and B cells, all of which are professional antigen-presenting cells (APCs). Like MHC class I molecules, class II molecules are also heterodimers, but in this case consist of two homologous polypeptides, an α and a β chain, both of which are encoded in the MHC. The peptides presented by class II molecules are derived from extracellular proteins (not cytosolic as in class I), hence the MHC class II-dependent pathway of antigen presentation is called the endocytic or exogenous pathway. Loading of class II molecules must still occur inside the cell; extracellular proteins are endocytosed, digested in lysosomes, and bound by the class II MHC molecule prior to the molecule's migration to the plasma membrane. Because the peptide-binding groove of MHC class II molecules is open at both ends while the corresponding groove on class I molecules is closed at each end, the peptides presented by MHC class II molecules are longer than those presented by class I, generally between 15 and 24 amino acid residues.
Because class II MHC is loaded with extracellular proteins, it is mainly concerned with presentation of extracellular pathogens (for example, bacteria that might be infecting a wound or the blood). Class II molecules interact exclusively with CD4+ ("helper") T cells (THs). The helper T cells then help to trigger an appropriate immune response which may include localized inflammation and swelling due to recruitment of phagocytes or may lead to a full-force antibody immune response due to activation of B cells.
MHC evolution and allelic diversity
MHC gene families are found in essentially all vertebrates, though the gene composition and genomic arrangement vary widely. Chickens, for instance, have one of the smallest known MHC regions (19 genes), though most mammals have an MHC structure and composition fairly similar to that of humans. Gene duplication is almost certainly responsible for much of the genic diversity. In humans, the MHC also contains many pseudogenes.
One of the most striking features of the MHC, particularly in humans, is the allelic diversity found therein, especially among the nine classical genes. In humans, the most conspicuously diverse loci, HLA-A, HLA-B, and HLA-DRB1, have roughly 250, 500, and 300 known alleles respectively, a level of diversity that is exceptional in the human genome. Population surveys of the other classical loci routinely find tens to a hundred alleles, which is still highly diverse. Perhaps even more remarkable is that many of these alleles are quite ancient: an allele from a particular HLA gene is often more closely related to an allele found in chimpanzees than to another human allele from the same gene.
The allelic diversity of MHC genes has created fertile ground for evolutionary biologists. The most important task for theoreticians is to explain the evolutionary forces that have created and maintained such diversity. Most explanations invoke balancing selection, a broad term for any kind of natural selection in which no single allele is absolutely most fit. Frequency-dependent selection and heterozygote advantage are two types of balancing selection that have been suggested to explain MHC allelic diversity.
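As an illustration of how heterozygote advantage can maintain more than one allele, the following is a minimal sketch of the textbook one-locus, two-allele selection model. It is not taken from any MHC study: the function names, the selection coefficients s and t, and the starting frequencies are all hypothetical and chosen only for illustration. Under the assumption that the heterozygote has fitness 1 and the two homozygotes have fitness 1 - s and 1 - t, the frequency of the first allele is expected to approach t / (s + t) instead of going to 0 or 1.

```python
def next_freq(p, s, t):
    """One generation of selection at a single locus with two alleles, A and a.

    Assumed (hypothetical) fitnesses: AA = 1 - s, Aa = 1, aa = 1 - t,
    so the heterozygote is the fittest genotype when s > 0 and t > 0.
    p is the current frequency of allele A; the return value is the
    frequency of A after selection, assuming random mating.
    """
    q = 1.0 - p
    w_AA, w_Aa, w_aa = 1.0 - s, 1.0, 1.0 - t
    # Mean fitness of the population under Hardy-Weinberg proportions.
    mean_w = p * p * w_AA + 2.0 * p * q * w_Aa + q * q * w_aa
    # Allele A is carried by AA individuals and by half of the Aa alleles.
    return (p * p * w_AA + p * q * w_Aa) / mean_w


def iterate(p0, s, t, generations):
    """Apply next_freq repeatedly, starting from frequency p0."""
    p = p0
    for _ in range(generations):
        p = next_freq(p, s, t)
    return p


if __name__ == "__main__":
    # A rare allele and a common allele both move toward the same
    # intermediate frequency, so neither allele is lost.
    for p0 in (0.01, 0.99):
        print(p0, iterate(p0, s=0.1, t=0.3, generations=2000))
```

The model is deterministic and ignores drift, mutation, and the many alleles found at real HLA loci, so it only shows the qualitative point: when the heterozygote is fittest, selection pushes allele frequencies toward a stable intermediate value.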