An expression vector, otherwise known as an expression construct, is usually a plasmid or virus designed for gene expression in cells. The vector is used to introduce a specific gene into a target cell, and can commandeer the cell's mechanism for protein synthesis to produce the protein encoded by the gene. Expression vectors are the basic tools in biotechnology for the production of proteins.
The vector is engineered to contain regulatory sequences that act as enhancer and promoter regions and lead to efficient transcription of the gene carried on the expression vector. The goal of a well-designed expression vector is the efficient production of protein, and this may be achieved by the production of a significant amount of stable messenger RNA, which can then be translated into protein. The expression of a protein may be tightly controlled, so that the protein is produced in significant quantity only when necessary through the use of an inducer; in some systems, however, the protein may be expressed constitutively. Escherichia coli is commonly used as the host for protein production, but other cell types may also be used. An example of the use of an expression vector is the production of insulin, which is used in the medical treatment of diabetes.
An expression vector has features that any vector may have, such as an origin of replication, a selectable marker, and a suitable site for the insertion of a gene, such as the multiple cloning site. The cloned gene may be transferred from a specialized cloning vector to an expression vector, although it is possible to clone directly into an expression vector. The cloning process is normally performed in Escherichia coli. Vectors used for protein production in organisms other than E. coli may have, in addition to a suitable origin of replication for their propagation in E. coli, elements that allow them to be maintained in another organism; these vectors are called shuttle vectors.
Elements for expression
An expression vector must have the elements necessary for gene expression. These may include a promoter, the correct translation initiation sequence such as a ribosomal binding site and start codon, a termination codon, and a transcription termination sequence. There are differences in the machinery for protein synthesis between prokaryotes and eukaryotes, so an expression vector must have the elements for expression that are appropriate for the chosen host. For example, prokaryotic expression vectors have a Shine-Dalgarno sequence at their translation initiation site for the binding of ribosomes, while eukaryotic expression vectors contain the Kozak consensus sequence.
The promoter initiates transcription and is therefore the point of control for the expression of the cloned gene. The promoters used in expression vectors are normally inducible, meaning that protein synthesis is only initiated when required by the introduction of an inducer such as IPTG. Gene expression may, however, also be constitutive (i.e. the protein is constantly expressed) in some expression vectors. A low level of constitutive protein synthesis may occur even in expression vectors with tightly controlled promoters.
After the expression of the gene product, it may be necessary to purify the expressed protein; however, separating the protein of interest from the great majority of proteins of the host cell can be a protracted process. To make this purification process easier, a purification tag may be added to the cloned gene. This tag could be a histidine (His) tag, another marker peptide, or a fusion partner such as glutathione S-transferase or maltose-binding protein. Some of these fusion partners may also help to increase the solubility of some expressed proteins. Other fusion proteins such as green fluorescent protein may act as a reporter for the identification of successfully cloned genes, or they may be used to study protein expression in cellular imaging.
The expression vector is transformed or transfected into the host cell for protein synthesis. Some expression vectors may have elements for transformation or for the insertion of DNA into the host chromosome, for example the vir genes for plant transformation, and integrase sites for chromosomal integration.
Some vectors may include a targeting sequence that directs the expressed protein to a specific location, such as the periplasmic space of bacteria.
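The host-specific translation initiation signals described above can be illustrated with a small sequence check. The following is a minimal sketch rather than a validated design tool: the function names, the simplified motifs (the core Shine-Dalgarno sequence AGGAGG and a reduced Kozak pattern with a purine at position -3 and G at +4), and the allowed spacing of 4 to 12 nucleotides are all assumptions made for illustration. Real ribosome binding sites often match the consensus only partially.

```python
import re

# Core Shine-Dalgarno consensus written as DNA (assumed simplification).
SHINE_DALGARNO = "AGGAGG"
# Reduced Kozak pattern: purine at -3, any two bases, ATG, then G at +4.
KOZAK = re.compile(r"[AG][ACGT]{2}ATGG")


def has_shine_dalgarno(seq, start, min_gap=4, max_gap=12):
    """Return True if AGGAGG ends min_gap..max_gap nt upstream of the ATG at index `start`.

    The gap limits are illustrative assumptions, not measured values.
    """
    seq = seq.upper()
    for gap in range(min_gap, max_gap + 1):
        end = start - gap
        begin = end - len(SHINE_DALGARNO)
        if begin >= 0 and seq[begin:end] == SHINE_DALGARNO:
            return True
    return False


def has_kozak(seq, start):
    """Return True if the 7-nt window around the ATG at `start` fits the reduced Kozak pattern."""
    seq = seq.upper()
    if start < 3:
        return False
    window = seq[start - 3:start + 4]
    return KOZAK.fullmatch(window) is not None


# Hypothetical test sequences; the ATG start codons are at index 15 and 6.
prokaryotic_site = "TTTAGGAGGAAACATATGAAA"
eukaryotic_site = "GCCACCATGGCT"
print(has_shine_dalgarno(prokaryotic_site, 15), has_kozak(prokaryotic_site, 15))
print(has_shine_dalgarno(eukaryotic_site, 6), has_kozak(eukaryotic_site, 6))
```

A check of this kind only flags whether a construct carries the initiation signal expected by the chosen host; it says nothing about expression level.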
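At the sequence level, adding a purification tag amounts to extending the coding sequence in frame. The sketch below appends a C-terminal polyhistidine tag just before the stop codon. It is an illustrative assumption throughout: the function name, the choice of six histidines, and the use of the CAC codon (CAT also encodes histidine) are not taken from any particular vector, and real constructs often include a linker or protease cleavage site as well.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
HIS_CODON = "CAC"  # CAC and CAT both encode histidine; CAC is an arbitrary choice here


def add_c_terminal_his_tag(cds, n_his=6):
    """Insert n_his histidine codons in frame, immediately before the stop codon.

    `cds` is a DNA coding sequence that must be a whole number of codons
    and end with a stop codon.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    if cds[-3:] not in STOP_CODONS:
        raise ValueError("coding sequence does not end with a stop codon")
    return cds[:-3] + HIS_CODON * n_his + cds[-3:]


# Hypothetical two-codon gene (Met-Ala) followed by a stop codon.
print(add_c_terminal_his_tag("ATGGCTTAA"))
```

Placing the tag before the stop codon keeps it in the same reading frame as the protein of interest, so that it is translated as part of one fusion polypeptide.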
Different organisms may be used to express a gene's target protein, and the expression vector used will therefore have elements specific for use in the particular organism. The most commonly used organism for protein production is the bacterium Escherichia coli. However, not all proteins can be successfully expressed in E. coli, or be expressed with the correct form of post-translational modifications such as glycosylations, and other systems may therefore be used.
Escherichia coli is the host of choice for the expression of many proteins, as the production of heterologous protein in E. coli is relatively simple and convenient, as well as rapid and cheap. A large number of E. coli expression plasmids are also available for a wide variety of needs. Other bacteria used for protein production include Bacillus subtilis.
Most heterologous proteins are expressed in the cytoplasm of E. coli. However, not all proteins formed may be soluble in the cytoplasm, and incorrectly folded proteins formed in the cytoplasm can form insoluble aggregates called inclusion bodies. Such insoluble proteins require refolding, which can be an involved process and may not necessarily produce a high yield. Proteins that have disulphide bonds are often unable to fold correctly because the reducing environment in the cytoplasm prevents such bond formation; a possible solution is to target the protein to the periplasmic space by the use of an N-terminal signal sequence. Another possibility is to manipulate the redox environment of the cytoplasm. Other more sophisticated systems are also being developed; such systems may allow for the expression of proteins previously thought impossible in E. coli, such as glycosylated proteins.
The promoters used for these vectors are usually based on the promoter of the lac operon or the T7 promoter, and they are normally regulated by the lac operator. These promoters may also be hybrids of different promoters; for example, the tac promoter is a hybrid of the trp and lac promoters. Note that most commonly used lac or lac-derived promoters are based on the lacUV5 mutant, which is insensitive to catabolite repression. With the wild-type lac promoter, glucose in the growth medium would inhibit gene expression; the lacUV5 mutant allows protein to be expressed under the control of the lac promoter even when the medium contains glucose. The presence of glucose may nevertheless still be used to reduce background expression through residual inhibition in some systems.
Examples of E. coli expression vectors are the pGEX series of vectors, in which glutathione S-transferase is used as a fusion partner and gene expression is under the control of the tac promoter, and the pET series of vectors, which use a T7 promoter.
It is possible to simultaneously express two or more different proteins in E. coli using different plasmids. However, when two or more plasmids are used, each plasmid needs to use a different antibiotic selection as well as a different origin of replication; otherwise one of the plasmids may not be stably maintained. Many commonly used plasmids are based on the ColE1 replicon and are therefore incompatible with each other; in order for a ColE1-based plasmid to coexist with another in the same cell, the other would need to be of a different replicon, e.g. a p15A replicon-based plasmid such as the pACYC series of plasmids. Another approach would be to use a single two-cistron vector or to design the coding sequences in tandem as a bi- or poly-cistronic construct.
A yeast commonly used for protein production is Pichia pastoris. Examples of yeast expression vectors in Pichia are the pPIC series of vectors, which use the AOX1 promoter, inducible with methanol. The plasmids may contain elements for the insertion of foreign DNA into the yeast genome and a signal sequence for the secretion of the expressed protein. Proteins with disulphide bonds and glycosylation can be efficiently produced in yeast. Another yeast used for protein production is Kluyveromyces lactis, in which expression of the gene is driven by a variant of the strong lactase LAC4 promoter.
Saccharomyces cerevisiae is particularly widely used for gene expression studies in yeast, for example in the yeast two-hybrid system for the study of protein-protein interactions. The vectors used in the yeast two-hybrid system contain fusion partners for two cloned genes that allow the transcription of a reporter gene when there is an interaction between the two proteins expressed from the cloned genes.
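The two compatibility rules for co-expression from separate plasmids (distinct replicons and distinct antibiotic selections) can be expressed as a simple pairwise check. This is a minimal sketch under stated assumptions: the `Plasmid` record, the function name, and the example plasmids and their markers are hypothetical, and treating replicon names as the sole test of incompatibility is a simplification.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Plasmid:
    name: str      # hypothetical label
    replicon: str  # origin of replication family, e.g. "ColE1" or "p15A"
    marker: str    # antibiotic used for selection


def coexpression_conflicts(plasmids):
    """Return a list of human-readable conflicts between every pair of plasmids."""
    conflicts = []
    for a, b in combinations(plasmids, 2):
        if a.replicon == b.replicon:
            conflicts.append(f"{a.name} and {b.name} share the {a.replicon} replicon")
        if a.marker == b.marker:
            conflicts.append(f"{a.name} and {b.name} share {a.marker} selection")
    return conflicts


# Hypothetical plasmids: names and markers are illustrative, not real catalogue entries.
vector_a = Plasmid("vector_a", "ColE1", "ampicillin")
vector_b = Plasmid("vector_b", "ColE1", "kanamycin")
vector_c = Plasmid("vector_c", "p15A", "chloramphenicol")

print(coexpression_conflicts([vector_a, vector_b]))  # one replicon conflict
print(coexpression_conflicts([vector_a, vector_c]))  # no conflicts
```

An empty list means the pair passes both rules; it does not guarantee stable maintenance in practice, which also depends on factors such as copy number and the metabolic burden on the host.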
In the baculovirus expression system, baculovirus, a rod-shaped virus which infects insect cells, is used as the expression vector. Insect cell lines derived from Lepidopterans (moths and butterflies), such as Spodoptera frugiperda, are used as hosts. A cell line derived from the cabbage looper is of particular interest, as it has been developed to grow fast and without the expensive serum normally needed to boost cell growth. The shuttle vector is called a bacmid, and gene expression is under the control of a strong promoter, pPolh. Baculovirus has also been used with mammalian cell lines in the BacMam system.
Baculovirus is normally used for the production of glycoproteins, although the glycosylations may be different from those found in vertebrates. In general, it is safer to use than mammalian viruses as it has a limited host range and does not infect vertebrates without modifications.
Many plant expression vectors are based on the Ti plasmid of Agrobacterium tumefaciens. In these expression vectors, the DNA to be inserted into the plant is cloned into the T-DNA, a stretch of DNA that is flanked by a 25-bp direct repeat sequence at either end and that can integrate into the plant genome. The T-DNA also contains the selectable marker. The Agrobacterium provides a mechanism for transformation and integration into the plant genome, and the promoters for its vir genes may also be used for the cloned genes. Concerns over the transfer of bacterial or viral genetic material into the plant have, however, led to the development of intragenic vectors, in which functional equivalents from the plant genome are used so that there is no transfer of genetic material from an alien species into the plant.
Plant viruses may be used as vectors since the Agrobacterium method does not work for all plants. Examples of plant viruses used are the tobacco mosaic virus (TMV), potato virus X, and cowpea mosaic virus. The protein may be expressed as a fusion to the coat protein of the virus and displayed on the surface of assembled viral particles, or as an unfused protein that accumulates within the plant. Expression in plants using plant vectors is often constitutive, and a commonly used constitutive promoter in plant expression vectors is the cauliflower mosaic virus (CaMV) 35S promoter.
Mammalian expression vectors offer considerable advantages over bacterial expression systems for the expression of mammalian proteins: proper folding, post-translational modifications, and relevant enzymatic activity. Mammalian systems may also be more desirable than other eukaryotic non-mammalian systems, in which the expressed proteins may not contain the correct glycosylations. They are of particular use in producing membrane-associated proteins that require chaperones for proper folding and stability and that contain numerous post-translational modifications. The downside, however, is the low yield of product in comparison to prokaryotic vectors, as well as the costly nature of the techniques involved. The complicated technology of mammalian cell expression and its potential contamination with animal viruses have also constrained its use in large-scale industrial production.
Cultured mammalian cell lines such as Chinese hamster ovary (CHO) and COS, as well as human cell lines such as HEK and HeLa, may be used to produce protein. Vectors are transfected into the cells, and the DNA may be integrated into the genome by homologous recombination in the case of stable transfection, or the cells may be transiently transfected. Examples of mammalian expression vectors include the adenoviral vectors, the pSV and the pCMV series of plasmid vectors, vaccinia and retroviral vectors, as well as baculovirus. The promoters for cytomegalovirus (CMV) and SV40 are commonly used in mammalian expression vectors to drive gene expression. Non-viral promoters, such as the elongation factor (EF)-1 promoter, are also used.
In cell-free expression systems, E. coli cell lysate containing the cellular components required for transcription and translation is used for in vitro protein production. The advantage of such a system is that protein may be produced much faster than in vivo, since no time is needed to culture the cells, but it is also more expensive. Vectors used for E. coli expression can be used in this system, although vectors designed specifically for it are also available. Eukaryotic cell extracts may also be used in other cell-free systems, for example the wheat germ cell-free expression systems. Mammalian cell-free systems have also been produced.
The use of an expression vector in an expression host is now the usual method of producing proteins for research in laboratories. Most proteins are produced in E. coli, but for glycosylated proteins and those with disulphide bonds, yeast, baculovirus and mammalian systems may be used.
Production of peptide and protein pharmaceuticals
Most protein pharmaceuticals are now produced through recombinant DNA technology using expression vectors. These peptide and protein pharmaceuticals may be hormones, vaccines, antibiotics, antibodies, and enzymes. The first human recombinant protein used for disease management, insulin, was introduced in 1982. Biotechnology allows these peptide and protein pharmaceuticals, some of which were previously rare or difficult to obtain, to be produced in large quantity. It also reduces the risks of contaminants such as host viruses, toxins and prions. Examples from the past include prion contamination in growth hormone extracted from pituitary glands harvested from human cadavers, which caused Creutzfeldt–Jakob disease in patients receiving treatment for dwarfism, and viral contaminants in clotting factor VIII isolated from human blood that resulted in the transmission of viral diseases such as hepatitis and AIDS. Such risk is reduced or removed completely when the proteins are produced in non-human host cells.
Transgenic plants and animals
In recent years, expression vectors have been used to introduce specific genes into plants and animals to produce transgenic organisms; in agriculture, for example, they are used to produce transgenic plants. Expression vectors have been used to introduce the biosynthesis of beta-carotene, a vitamin A precursor, into rice plants; the product is called golden rice. The process has also been used to introduce into plants a gene that produces an insecticide, Bacillus thuringiensis toxin or Bt toxin, which reduces the need for farmers to apply insecticides since the toxin is produced by the modified organism. In addition, expression vectors are used to extend the shelf life of ripe tomatoes by altering the plant so that it produces less of the chemical that causes the tomatoes to rot. There have been controversies over using expression vectors to modify crops because of possible unknown health risks, the possibility of companies patenting certain genetically modified food crops, and ethical concerns. Nevertheless, this technique is still being used and heavily researched.
Transgenic animals have also been produced to study animal biochemical processes and human diseases, or to produce pharmaceuticals and other proteins. They may also be engineered to have advantageous or useful traits. Green fluorescent protein is sometimes used as a tag, which results in animals that can fluoresce, and this has been exploited commercially to produce the fluorescent GloFish.
Gene therapy is a promising treatment for a number of diseases, in which a "normal" gene carried by the vector is inserted into the genome to replace an "abnormal" gene or to supplement the expression of a particular gene. Viral vectors are generally used, but nonviral methods of delivery are also being developed. The treatment remains risky because the viral vector used can cause ill effects, for example insertional mutation that can result in cancer. However, there have been promising results.