An expression vector, otherwise known as an expression construct, is usually a plasmid or virus designed for protein expression in cells. The vector is used to introduce a specific gene into a target cell, and can commandeer the cell's mechanism for protein synthesis to produce the protein encoded by the gene. The plasmid is engineered to contain regulatory sequences that act as enhancer and promoter regions and lead to efficient transcription of the gene carried on the expression vector. The goal of a well-designed expression vector is the production of a significant amount of stable messenger RNA, and therefore of protein. Expression vectors are basic tools of biotechnology and are used to produce proteins such as insulin, which is important in the medical treatment of diabetes.
Elements of expression vectors
An expression vector has features that any vector may have, such as an origin of replication, a selectable marker, and a suitable site for the insertion of a gene, such as the multiple cloning site. The cloned gene may be transferred from a specialized cloning vector to an expression vector, although it is possible to clone directly into an expression vector. The cloning process is normally performed in Escherichia coli. Vectors used for protein expression in organisms other than E. coli may therefore carry, in addition to a suitable origin of replication for propagation in E. coli, elements that allow them to be maintained in another organism; such vectors are called shuttle vectors.
Elements for expression
An expression vector must have the elements necessary for protein expression. These may include a strong promoter, the correct translation initiation sequence such as a ribosomal binding site and start codon, a strong termination codon, and a transcription termination sequence. The machinery for protein synthesis differs between prokaryotes and eukaryotes, so an expression vector must carry the expression elements appropriate for the chosen host. For example, prokaryotic expression vectors have a Shine-Dalgarno sequence at the translation initiation site for the binding of ribosomes, while eukaryotic expression vectors contain the Kozak consensus sequence.
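The difference between prokaryotic and eukaryotic translation initiation signals can be illustrated with a short motif scan. This is a minimal sketch, not a validated sequence-analysis tool: the function name `translation_signals` is hypothetical, and the patterns are simplified assumptions (the core Shine-Dalgarno consensus AGGAGG placed 4 to 10 nucleotides upstream of an ATG start codon, and a strong Kozak context with a purine at position -3 and G at position +4). Real sequences often deviate from these consensus forms.

```python
import re

# Simplified consensus patterns, assumed for illustration only.
# Shine-Dalgarno: AGGAGG followed by a 4-10 nt spacer and an ATG start codon.
SHINE_DALGARNO = re.compile(r"AGGAGG[ACGT]{4,10}ATG")
# Kozak context: purine at -3, then CC, the ATG start codon, and G at +4.
KOZAK = re.compile(r"[AG]CCATGG")


def translation_signals(seq):
    """Report which simplified initiation motifs occur in a DNA sequence."""
    seq = seq.upper()
    return {
        "shine_dalgarno": SHINE_DALGARNO.search(seq) is not None,
        "kozak": KOZAK.search(seq) is not None,
    }


# Hypothetical test sequences, not taken from any real vector.
prokaryotic_like = "TTAAGGAGGTATACATATGAAA"
eukaryotic_like = "GCCGCCACCATGGCG"
print(translation_signals(prokaryotic_like))
print(translation_signals(eukaryotic_like))
```

A scan of this kind only flags the presence of a motif; it says nothing about promoter strength or whether the start codon is in frame with the cloned gene.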
The promoter initiates transcription and is therefore the point of control for the expression of the cloned gene. The promoters used in expression vectors are normally inducible, meaning that protein synthesis is only initiated when required, by the introduction of an inducer such as IPTG. In some expression vectors, however, protein expression may be constitutive (i.e. the protein is constantly expressed). A low level of constitutive protein synthesis may occur even in expression vectors with tightly controlled promoters.
After the expression of the gene product, it is usually necessary to purify the expressed protein. However, separating the protein of interest from the great majority of host cell proteins can be a protracted process. To make purification easier, a purification tag may be added to the cloned gene. This tag could be a histidine (His) tag, another marker peptide, or a fusion partner such as glutathione S-transferase or maltose-binding protein. Some of these fusion partners may also help to increase the solubility of some expressed proteins. Other fusion partners, such as green fluorescent protein, may act as reporters.
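Adding a tag to a cloned gene amounts to inserting extra codons in frame with the coding sequence. The sketch below shows this for a C-terminal His tag at the sequence level. It is illustrative only: the function name `add_c_terminal_his_tag` and the default of six histidines are assumptions, CAC is used as the histidine codon, and in practice the tag is usually supplied by the vector rather than edited into the gene by hand.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def add_c_terminal_his_tag(cds, n_his=6):
    """Insert n_his histidine codons (CAC) just before the stop codon.

    `cds` is assumed to be an in-frame coding sequence that ends in a
    stop codon; anything else raises ValueError.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0 or cds[-3:] not in STOP_CODONS:
        raise ValueError("expected an in-frame coding sequence ending in a stop codon")
    # Keep the original stop codon and place the tag codons in front of it.
    return cds[:-3] + "CAC" * n_his + cds[-3:]


# Hypothetical three-codon gene: start codon, one alanine codon, stop codon.
print(add_c_terminal_his_tag("ATGGCTTAA"))
```

Keeping the tag in frame and ahead of the stop codon is the essential point; a linker or protease cleavage site between protein and tag is often added as well, but is omitted here.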
The expression vector is transformed or transfected into the host cell for protein synthesis. Some expression vectors may have elements for transformation or for the insertion of DNA into the host chromosome, for example the vir genes for plant transformation, and integrase sites for chromosomal insertion.
Some vectors may include a targeting sequence that directs the expressed protein to a specific location, such as the periplasmic space of bacteria.
Expression systems
Different organisms may be used to express a target protein, and the expression vector used will therefore have elements specific to the particular organism. The most commonly used organism for protein expression is the bacterium Escherichia coli. However, not all proteins can be successfully expressed in E. coli, and other systems may therefore be used.
The expression host of choice for many proteins is Escherichia coli, as the production of heterologous protein in E. coli is relatively simple and convenient, as well as rapid and cheap. A large number of E. coli expression plasmids, suitable for a wide variety of needs, are also available. Other bacteria used for protein expression include Bacillus subtilis.
Most proteins are expressed in the cytoplasm of E. coli, but where necessary, for example when the protein can only fold correctly in an oxidizing environment because it contains disulphide bonds, the protein may be targeted to the periplasmic space by the use of an N-terminal signal sequence. Other, more sophisticated systems are being developed; such systems may allow the expression of proteins previously thought impossible to produce in E. coli, such as glycosylated proteins.
The promoters used for these vectors are usually based on the promoter of the lac operon or the T7 promoter, and they are normally regulated by the lac operator. These promoters may also be hybrids of different promoters; for example, the tac promoter is a hybrid of the trp and lac promoters. Most commonly used lac or lac-derived promoters are based on the lacUV5 mutant, which is insensitive to catabolite repression. With the wild-type lac promoter, glucose in the growth medium would inhibit protein expression; the lacUV5 mutant allows expression under the control of the lac promoter even when glucose is present. Glucose may nevertheless still be used in some systems to reduce background expression through residual inhibition.
Examples of E. coli expression vectors are the pGEX series of vectors, in which glutathione S-transferase is used as a fusion partner and protein expression is under the control of the tac promoter, and the pET series of vectors, which use a T7 promoter.
It is possible to express two or more different proteins simultaneously in E. coli using different plasmids. However, when two or more plasmids are used, each plasmid needs a different antibiotic selection as well as a different origin of replication, otherwise the plasmids may not be stably maintained. Many commonly used plasmids are based on the ColE1 replicon and are therefore incompatible with each other; for a ColE1-based plasmid to coexist with another plasmid in the same cell, the other must be of a different replicon, e.g. a p15A replicon-based plasmid such as the pACYC series of plasmids. Another approach is to use a single two-cistron vector, or to design the coding sequences in tandem as a bi- or poly-cistronic construct.
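The two requirements for co-expression from separate plasmids, distinct replicons and distinct selection markers, can be checked with a small pairwise comparison. This is a minimal sketch: the `Plasmid` record, the function `coexpression_conflicts`, and the example plasmids with their markers are hypothetical and chosen only to illustrate the rule described above.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Plasmid:
    name: str
    replicon: str  # e.g. "ColE1" or "p15A"
    marker: str    # antibiotic used for selection


def coexpression_conflicts(plasmids):
    """Return (name_a, name_b, reason) for every pair that breaks the rule."""
    conflicts = []
    for a, b in combinations(plasmids, 2):
        if a.replicon == b.replicon:
            conflicts.append((a.name, b.name, "same replicon"))
        if a.marker == b.marker:
            conflicts.append((a.name, b.name, "same selection marker"))
    return conflicts


# Hypothetical plasmids; names and markers are placeholders.
plasmid_a = Plasmid("plasmid_A", "ColE1", "ampicillin")
plasmid_b = Plasmid("plasmid_B", "ColE1", "kanamycin")
plasmid_c = Plasmid("plasmid_C", "p15A", "chloramphenicol")

print(coexpression_conflicts([plasmid_a, plasmid_b]))  # ColE1 with ColE1
print(coexpression_conflicts([plasmid_a, plasmid_c]))  # ColE1 with p15A
```

An empty list means the pair satisfies both conditions; it does not guarantee stable maintenance, which also depends on factors such as copy number and the metabolic burden on the host.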
A yeast commonly used for protein expression is Pichia pastoris. Examples of yeast expression vectors for Pichia are the pPIC series of vectors, which use the AOX1 promoter, inducible with methanol. The plasmids may contain elements for the insertion of foreign DNA into the yeast genome and a signal sequence for the secretion of the expressed protein. Proteins with disulphide bonds and glycosylation can be efficiently produced in yeast. Another yeast used for protein expression is Kluyveromyces lactis, in which expression is driven by a variant of the strong lactase LAC4 promoter.
Saccharomyces cerevisiae is also commonly used for protein expression, for example in the yeast two-hybrid system for the study of protein-protein interactions. The vectors used in the yeast two-hybrid system contain fusion partners for two cloned genes that allow the transcription of a reporter gene when the two proteins expressed from the cloned genes interact.
In the baculovirus system, baculovirus, a rod-shaped virus that infects insect cells, is used as the expression vector. Insect cell lines derived from Lepidopterans (moths and butterflies), such as Spodoptera frugiperda, are used as hosts. The shuttle vector is called a bacmid, and protein expression is under the control of the strong promoter pPolh. The system is normally used for the production of glycoproteins, although the glycosylation may differ from that found in vertebrates. It is safer to use than mammalian viruses, as baculovirus has a limited host range and does not infect vertebrates.
Many plant expression vectors are based on the Ti plasmid of Agrobacterium tumefaciens. In these expression vectors, the DNA to be inserted into the plant is cloned into the T-DNA, a stretch of DNA that is flanked by a 25-bp direct repeat sequence at either end and can integrate into the plant genome. The T-DNA also contains the selectable marker. The Agrobacterium provides a mechanism for transformation and integration into the plant genome, and the promoters for its vir genes may also be used for the cloned genes.
Some plant viruses may be used as vectors, since Agrobacterium does not work for all plants. Examples of plant viruses used are the tobacco mosaic virus (TMV), potato virus X, and cowpea mosaic virus. The protein may be expressed as a fusion to the coat protein of the virus and displayed on the surface of assembled viral particles, or as an unfused protein that accumulates within the plant. Expression in plants using plant vectors is often constitutive, and a commonly used constitutive promoter in plant expression vectors is the cauliflower mosaic virus (CaMV) 35S promoter.
Cultured mammalian cell lines such as the Chinese hamster ovary (CHO), HEK and COS cell lines are used to produce protein. Vectors are transfected into the cells and the DNA may be integrated into the genome by homologous recombination in the case of stable transfection, or the cells may be transiently transfected. Examples of mammalian expression vectors include the adenoviral vectors, the pSV and the pCMV series of vectors. The promoters for cytomegalovirus (CMV) and SV40 are commonly used in mammalian expression vectors to drive protein expression.
In cell-free expression, an in vitro method, E. coli cell lysate containing the cellular components required for transcription and translation is used to produce protein. The advantage of such a system is that protein may be produced much faster than in vivo, but it is also more expensive. Vectors used for E. coli expression can be used in this system, although vectors designed specifically for it are also available. Eukaryotic cell extracts may also be used in other cell-free systems, for example the wheat germ cell-free expression system.
Applications
Using an expression vector in an expression host is now the usual laboratory method of producing proteins for research. Most proteins are produced in E. coli, but for glycosylated proteins and those with disulphide bonds, yeast, baculovirus and mammalian systems may be used.
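The choice of host described here can be written as a simple rule of thumb. The sketch below is an assumption-laden simplification, not a selection procedure: the function name `candidate_hosts` and its return labels are hypothetical, and it encodes only the two protein properties mentioned in this article (glycosylation and disulphide bonds), including the option of periplasmic targeting in E. coli for disulphide-bonded proteins.

```python
def candidate_hosts(glycosylated=False, disulphide_bonds=False):
    """List candidate expression systems for a protein (rule of thumb only)."""
    eukaryotic = ["yeast", "baculovirus/insect cells", "mammalian cells"]
    if glycosylated:
        # Glycosylated proteins are generally produced in eukaryotic systems.
        return eukaryotic
    if disulphide_bonds:
        # The E. coli periplasm provides an oxidizing environment for folding.
        return ["E. coli (periplasmic targeting)"] + eukaryotic
    return ["E. coli (cytoplasmic)"]


print(candidate_hosts())
print(candidate_hosts(disulphide_bonds=True))
print(candidate_hosts(glycosylated=True))
```

Real decisions also weigh yield, cost, protein size, toxicity to the host, and the required glycosylation pattern, none of which are modelled here.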
Production of peptide and protein pharmaceuticals
Most protein pharmaceuticals are now produced through recombinant DNA technology using expression vectors. These peptide and protein pharmaceuticals may be hormones, vaccines, antibiotics, antibodies, and enzymes. The first human recombinant protein used for disease management was insulin, introduced in 1982. Biotechnology allows these peptide and protein pharmaceuticals, some of which were previously rare or difficult to obtain, to be produced in large amounts. It also reduces the risk of contaminants such as host viruses, toxins and prions. For example, growth hormone extracted from pituitary glands harvested from human cadavers caused Creutzfeldt–Jakob disease, through prion contamination, in patients receiving treatment for dwarfism. Similarly, viral contaminants in clotting factor VIII isolated from human blood resulted in the transmission of viral diseases such as hepatitis and AIDS. Such risks are reduced or removed completely when these proteins are produced in non-human cell lines.
Transgenic plants and animals
In recent years, expression vectors have been used to introduce specific genes into organisms; in agriculture, for example, they are used to produce transgenic plants. Expression vectors have been used to introduce a vitamin A precursor, beta-carotene, into rice plants. This product is called golden rice. The process has also been used to introduce into plants a gene that produces an insecticide, called Bacillus thuringiensis toxin or Bt toxin, which reduces the need for farmers to apply insecticides since the toxin is produced by the modified organism. In addition, expression vectors are used to extend the ripeness of tomatoes by altering the plant so that it produces less of the chemical that causes the tomatoes to rot. The use of expression vectors to modify crops has been controversial because of possible unknown health risks, the possibility of companies patenting certain genetically modified food crops, and ethical concerns. Nevertheless, the technique is still being used and heavily researched.
Transgenic animals have also been produced to study animal biochemical processes and human diseases, or to produce pharmaceuticals and other proteins. They may also be engineered to have advantageous or useful traits. Green fluorescent protein is sometimes used as a tag, which results in animals that can fluoresce, and this has been exploited commercially to produce the fluorescent GloFish.
Gene therapy is a promising treatment for a number of diseases, in which a "normal" gene carried by the vector is inserted into the genome to replace an "abnormal" gene or to supplement the expression of a particular gene. Viral vectors are generally used, but nonviral methods of delivery are being developed. The treatment is still a risky option because the viral vector used can cause ill effects, for example by giving rise to insertional mutations that can result in cancer. However, there have been promising results.