An expression vector, otherwise known as an expression construct, is usually a plasmid or virus designed for protein expression in cells. The vector is used to introduce a specific gene into a target cell, and can commandeer the cell's machinery for protein synthesis to produce the protein encoded by the gene. The vector is engineered to contain regulatory sequences that act as enhancer and promoter regions and lead to efficient transcription of the gene it carries. The goal of a well-designed expression vector is the production of significant amounts of stable messenger RNA, and therefore of protein. Expression vectors are basic tools of biotechnology and protein production. One example of a protein produced this way is insulin, which is used in the medical treatment of diabetes.
Elements of expression vectors
An expression vector has features that any vector may have, such as an origin of replication, a selectable marker, and a suitable site for the insertion of a gene, such as the multiple cloning site. The cloned gene may be transferred from a specialized cloning vector to an expression vector, although it is also possible to clone directly into an expression vector. The cloning process is normally performed in Escherichia coli. Vectors used for protein expression in organisms other than E. coli may therefore have, in addition to a suitable origin of replication for their propagation in E. coli, elements that allow them to be maintained in another organism; such vectors are called shuttle vectors.
Elements for expression
An expression vector must have the elements necessary for protein expression. These may include a strong promoter, the correct translation initiation sequence such as a ribosomal binding site and start codon, a strong termination codon, and a transcription termination sequence. The machinery for protein synthesis differs between prokaryotes and eukaryotes, so an expression vector must carry the expression elements appropriate for the chosen host. For example, a prokaryotic expression vector would have a Shine-Dalgarno sequence at its translation initiation site for the binding of ribosomes, while a eukaryotic expression vector would contain the Kozak consensus sequence.
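The difference between the two translation initiation contexts can be illustrated with a short Python sketch. It is a simplification under stated assumptions, not a validated prediction tool: it looks only for the core Shine-Dalgarno motif AGGAGG a few nucleotides upstream of the start codon (the 4 to 10 nt spacing window is an assumed, adjustable parameter), and it tests only the two most conserved Kozak positions, a purine at -3 and a G at +4. The function names are hypothetical.

```python
SHINE_DALGARNO = "AGGAGG"  # core Shine-Dalgarno motif


def has_shine_dalgarno(seq: str, start: int, min_gap: int = 4, max_gap: int = 10) -> bool:
    """Return True if the core SD motif ends min_gap..max_gap nt upstream of the ATG at `start`.

    The spacing window is an assumption chosen for illustration.
    """
    seq = seq.upper()
    for gap in range(min_gap, max_gap + 1):
        motif_start = start - gap - len(SHINE_DALGARNO)
        if motif_start >= 0 and seq[motif_start:motif_start + len(SHINE_DALGARNO)] == SHINE_DALGARNO:
            return True
    return False


def has_strong_kozak(seq: str, start: int) -> bool:
    """Return True if position -3 is a purine and position +4 is G (A of ATG is +1)."""
    seq = seq.upper()
    if start < 3 or start + 3 >= len(seq):
        return False  # not enough flanking sequence to evaluate
    return seq[start - 3] in "AG" and seq[start + 3] == "G"


# Made-up example sequences, each with its start codon located by a simple search.
prokaryotic = "TTAGGAGGTATACATATGAAA"
eukaryotic = "GCCACCATGGCG"
print(has_shine_dalgarno(prokaryotic, prokaryotic.find("ATG")))
print(has_strong_kozak(eukaryotic, eukaryotic.find("ATG")))
```

Real ribosome binding sites vary in both sequence and spacing, so a check like this is useful only as a quick sanity test when designing a construct for a given host.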
The promoter initiates transcription and is therefore the point of control for the expression of the cloned gene. The promoters used in expression vectors are normally inducible, meaning that protein synthesis is initiated only when required, by the introduction of an inducer such as IPTG. In some expression vectors, however, protein expression may be constitutive (i.e. the protein is constantly expressed). A low level of constitutive protein synthesis may occur even in expression vectors with tightly controlled promoters.
After the expression of the gene product, it is usually necessary to purify the expressed protein; however, separating the protein of interest from the great majority of host cell proteins can be a protracted process. To make purification easier, a purification tag may be added to the cloned gene. This tag could be a histidine (His) tag, another marker peptide, or a fusion partner such as glutathione S-transferase or maltose-binding protein. Some of these fusion partners may also help to increase the solubility of some expressed proteins. Other fusion proteins, such as green fluorescent protein, may act as a reporter for the identification of successfully cloned genes.
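As a rough illustration of what adding a tag means at the sequence level, the following Python sketch appends histidine codons to the end of a coding sequence, just before its stop codon. It assumes an in-frame sequence that ends with a stop codon, uses CAT for every histidine (CAC would work equally well), and omits the linkers or protease cleavage sites that many real vectors place between the protein and the tag. The function name is hypothetical.

```python
HIS_CODON = "CAT"  # one of the two histidine codons (CAT and CAC)
STOP_CODONS = {"TAA", "TAG", "TGA"}


def add_c_terminal_his_tag(cds: str, n_his: int = 6) -> str:
    """Insert n_his histidine codons immediately before the stop codon of a coding sequence."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if cds[-3:] not in STOP_CODONS:
        raise ValueError("coding sequence must end with a stop codon")
    body, stop = cds[:-3], cds[-3:]
    # Tag codons go between the last sense codon and the stop codon.
    return body + HIS_CODON * n_his + stop


# Toy coding sequence: Met-Ala-stop.
print(add_c_terminal_his_tag("ATGGCTTAA"))
```

In practice the tag is usually supplied by the vector itself, and the gene is cloned in frame with it rather than edited directly.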
The expression vector is transformed or transfected into the host cell for protein synthesis. Some expression vectors may have elements for transformation or for the insertion of DNA into the host chromosome, for example the vir genes for plant transformation, and integrase sites for chromosomal insertion.
Some vectors may include a targeting sequence that directs the expressed protein to a specific location, such as the periplasmic space of bacteria.
Expression systems
Different organisms may be used to express a target protein, and the expression vector used will therefore have elements specific to the particular organism. The most commonly used organism for protein expression is the bacterium Escherichia coli. However, not all proteins can be successfully expressed in E. coli, and other systems may therefore be used.
The expression host of choice for many proteins is Escherichia coli, as the production of heterologous protein in E. coli is relatively simple and convenient, as well as rapid and cheap. A large number of E. coli expression plasmids are also available to suit a wide variety of needs. Other bacteria used for protein expression include Bacillus subtilis.
Most heterologous proteins are expressed in the cytoplasm of E. coli. However, not all proteins formed may be soluble in the cytoplasm, and incorrectly folded proteins formed in the cytoplasm can form insoluble aggregates called inclusion bodies. Such insoluble proteins require refolding, which can be an involved process and may not produce a high yield. Where necessary, for example when the protein can only fold correctly in an oxidizing environment because it contains disulphide bonds, the protein may be targeted to the periplasmic space by the use of an N-terminal signal sequence. More sophisticated systems are being developed; such systems may allow for the expression in E. coli of proteins previously thought impossible to produce there, such as glycosylated proteins.
The promoters used for these vectors are usually based on the promoter of the lac operon or the T7 promoter, and they are normally regulated by the lac operator. These promoters may also be hybrids of different promoters; for example, the tac promoter is a hybrid of the trp and lac promoters. Note that most commonly used lac or lac-derived promoters are based on the lacUV5 mutant, which is insensitive to catabolite repression. With the wild-type lac promoter, glucose in the growth medium would inhibit protein expression; the lacUV5 mutant allows expression under the control of the lac promoter even when glucose is present. In some systems, glucose may nevertheless still be used to reduce background expression through residual inhibition.
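The regulatory logic described above can be summarized as a small truth table. The Python sketch below is a deliberately simplified qualitative model, not a description of any particular vector: it treats expression as simply on or off, ignores basal (leaky) expression and the residual glucose effect mentioned above, and assumes the promoter is controlled by the lac operator. The function name and its parameters are hypothetical.

```python
def lac_promoter_active(inducer_present: bool, glucose_present: bool,
                        promoter: str = "lacUV5") -> bool:
    """Qualitative on/off model of a lac-operator-controlled promoter.

    promoter: "lac" (wild type, subject to catabolite repression) or
              "lacUV5" (mutant insensitive to catabolite repression).
    """
    if promoter not in ("lac", "lacUV5"):
        raise ValueError("promoter must be 'lac' or 'lacUV5'")
    if not inducer_present:
        return False  # repressor stays bound to the operator (leakiness ignored)
    if promoter == "lac" and glucose_present:
        return False  # catabolite repression of the wild-type promoter
    return True


# Print the full truth table for both promoters.
for promoter in ("lac", "lacUV5"):
    for inducer in (False, True):
        for glucose in (False, True):
            state = lac_promoter_active(inducer, glucose, promoter)
            print(f"{promoter:7s} inducer={inducer!s:5s} glucose={glucose!s:5s} -> {state}")
```

The only row where the two promoters differ in this model is induction in the presence of glucose, which is the practical reason lacUV5-derived promoters are preferred.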
Examples of E. coli expression vectors are the pGEX series of vectors, in which glutathione S-transferase is used as a fusion partner and protein expression is under the control of the tac promoter, and the pET series of vectors, which use a T7 promoter.
It is possible to simultaneously express two or more different proteins in E. coli using different plasmids. However, when two or more plasmids are used, each plasmid needs a different antibiotic selection as well as a different origin of replication; otherwise the plasmids may not be stably maintained. Many commonly used plasmids are based on the ColE1 replicon and are therefore incompatible with each other. For a ColE1-based plasmid to coexist with another plasmid in the same cell, the other must be based on a different replicon, e.g. a p15A replicon-based plasmid such as the pACYC series of plasmids. Another approach is to use a single two-cistron vector, or to arrange the coding sequences in tandem as a bi- or poly-cistronic construct.
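The two compatibility rules for co-expression (distinct replicons and distinct selection markers) lend themselves to a simple pairwise check. The following Python sketch is illustrative only: the `Plasmid` class, the `find_conflicts` function and the plasmid names are hypothetical, and replicons are compared as plain labels rather than by any real incompatibility-group database.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Plasmid:
    name: str
    replicon: str  # e.g. "ColE1" or "p15A"
    marker: str    # antibiotic used for selection


def find_conflicts(plasmids: list[Plasmid]) -> list[str]:
    """Return a message for every pair sharing a replicon or a selection marker."""
    problems = []
    for a, b in combinations(plasmids, 2):
        if a.replicon == b.replicon:
            problems.append(f"{a.name} and {b.name} share the {a.replicon} replicon")
        if a.marker == b.marker:
            problems.append(f"{a.name} and {b.name} share {a.marker} selection")
    return problems


# Hypothetical plasmids used only to exercise the check.
plasmid_a = Plasmid("plasmidA", "ColE1", "ampicillin")
plasmid_b = Plasmid("plasmidB", "p15A", "chloramphenicol")
plasmid_c = Plasmid("plasmidC", "ColE1", "kanamycin")

print(find_conflicts([plasmid_a, plasmid_b]))  # compatible pair: no messages
print(find_conflicts([plasmid_a, plasmid_c]))  # both are ColE1-based
```

An empty result means the set satisfies both rules from the paragraph above; it does not guarantee stable maintenance, which also depends on factors such as copy number and metabolic burden.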
A yeast commonly used for protein expression is Pichia pastoris. Examples of yeast expression vectors for Pichia are the pPIC series of vectors, which use the AOX1 promoter, inducible with methanol. The plasmids may contain elements for insertion of foreign DNA into the yeast genome and a signal sequence for the secretion of the expressed protein. Proteins with disulphide bonds and glycosylation can be efficiently produced in yeast. Another yeast used for protein expression is Kluyveromyces lactis, in which expression is driven by a variant of the strong lactase LAC4 promoter.
Saccharomyces cerevisiae is particularly widely used for protein expression studies in yeast, for example in the yeast two-hybrid system for the study of protein-protein interactions. The vectors used in the yeast two-hybrid system contain fusion partners for two cloned genes that allow the transcription of a reporter gene when the two proteins expressed from the cloned genes interact.
Baculovirus, a rod-shaped virus that infects insect cells, is used as the expression vector in insect cell expression systems. Insect cell lines derived from Lepidopterans (moths and butterflies), such as Spodoptera frugiperda, are used as hosts. The shuttle vector is called a bacmid, and protein expression is under the control of the strong promoter pPolh. Baculovirus has also been used with mammalian cell lines in the BacMam system.
Baculovirus is normally used for the production of glycoproteins, although the glycosylations may differ from those found in vertebrates. In general, it is safer to use than mammalian viruses, as it has a limited host range and does not infect vertebrates without modification.
Many plant expression vectors are based on the Ti plasmid of Agrobacterium tumefaciens. In these expression vectors, the DNA to be inserted into the plant is cloned into the T-DNA, a stretch of DNA that is flanked by a 25-bp direct repeat sequence at either end and that can integrate into the plant genome. The T-DNA also contains the selectable marker. Agrobacterium provides a mechanism for transformation and for integration into the plant genome, and the promoters for its vir genes may also be used for the cloned genes.
Plant viruses may also be used as vectors, since Agrobacterium does not work for all plants. Examples of plant viruses used are tobacco mosaic virus (TMV), potato virus X, and cowpea mosaic virus. The protein may be expressed as a fusion to the coat protein of the virus and displayed on the surface of assembled viral particles, or as an unfused protein that accumulates within the plant. Expression in plants using plant vectors is often constitutive, and a commonly used constitutive promoter in plant expression vectors is the cauliflower mosaic virus (CaMV) 35S promoter.
Mammalian expression vectors offer considerable advantages over bacterial expression systems for the expression of mammalian proteins: proper folding, post-translational modifications, and relevant enzymatic activity. They may also be more desirable than other eukaryotic, non-mammalian systems, in which the proteins expressed may not carry the correct glycosylations. They are of particular use in producing membrane-associated proteins that require chaperones for proper folding and stability and that contain numerous post-translational modifications. The downside, however, is the low yield of product in comparison to prokaryotic vectors, as well as the costly nature of the techniques involved. The complicated technology of mammalian cell expression, and its potential contamination with animal viruses, have also placed a constraint on its use in large-scale industrial production.
Cultured mammalian cell lines such as the Chinese hamster ovary (CHO), HEK, HeLa, and COS cell lines may be used to produce protein. Vectors are transfected into the cells, and the DNA may be integrated into the genome by homologous recombination in the case of stable transfection, or the cells may be transiently transfected. Examples of mammalian expression vectors include adenoviral vectors, the pSV and pCMV series of plasmid vectors, vaccinia and retroviral vectors, as well as baculovirus. The promoters of cytomegalovirus (CMV) and SV40 are commonly used in mammalian expression vectors to drive protein expression. Non-viral promoters, such as the elongation factor (EF)-1 promoter, are also used.
In cell-free (in vitro) protein expression, E. coli cell lysate containing the cellular components required for transcription and translation is used. The advantage of such a system is that protein may be produced much faster than in vivo, but it is also more expensive. Vectors used for E. coli expression can be used in this system, although vectors designed specifically for it are also available. Eukaryotic cell extracts may also be used in other cell-free systems, for example the wheat germ cell-free expression systems. Mammalian cell-free systems have also been produced.
Applications
Expressing a gene from an expression vector in an expression host is now the usual method used in laboratories to produce proteins for research. Most proteins are produced in E. coli, but for glycosylated proteins and those with disulphide bonds, yeast, baculovirus and mammalian systems may be used.
Production of peptide and protein pharmaceuticals
Most protein pharmaceuticals are now produced through recombinant DNA technology using expression vectors. These peptide and protein pharmaceuticals may be hormones, vaccines, antibiotics, antibodies, and enzymes. The first human recombinant protein used for disease management was insulin, introduced in 1982. Biotechnology allows these peptide and protein pharmaceuticals, some of which were previously rare or difficult to obtain, to be produced in large amounts. It also reduces the risk of contaminants such as host viruses, toxins and prions. For example, growth hormone extracted from pituitary glands harvested from human cadavers caused Creutzfeldt–Jakob disease in patients receiving treatment for dwarfism, owing to prion contamination, and viral contaminants in clotting factor VIII isolated from human blood resulted in the transmission of viral diseases such as hepatitis and AIDS. Such risks are reduced or removed completely when these proteins are produced in non-human cell lines.
Transgenic plants and animals
In recent years, expression vectors have been used to introduce specific genes into plants and animals to produce transgenic organisms; in agriculture, for example, they are used to produce transgenic plants. Expression vectors have been used to enable rice plants to produce beta-carotene, a vitamin A precursor; the product is called golden rice. The same approach has been used to introduce into plants a gene for an insecticide, Bacillus thuringiensis toxin or Bt toxin, which reduces the need for farmers to apply insecticides because the toxin is produced by the modified organism. Expression vectors have also been used to extend the ripeness of tomatoes by altering the plant so that it produces less of the chemical that causes the tomatoes to rot. The use of expression vectors to modify crops has been controversial because of possible unknown health risks, the possibility of companies patenting certain genetically modified food crops, and ethical concerns. Nevertheless, the technique is still being used and heavily researched.
Transgenic animals have also been produced to study animal biochemical processes and human diseases, or to produce pharmaceuticals and other proteins. They may also be engineered to have advantageous or useful traits. Green fluorescent protein is sometimes used as a tag, which results in animals that can fluoresce, and this has been exploited commercially to produce the fluorescent GloFish.
Gene therapy is a promising treatment for a number of diseases, in which a "normal" gene carried by the vector is inserted into the genome to replace an "abnormal" gene or to supplement the expression of a particular gene. Viral vectors are generally used, but nonviral methods of delivery are being developed. The treatment is still a risky option because the viral vector used can cause ill effects, for example by giving rise to insertional mutations that can result in cancer. However, there have been promising results.