Protein production

Central dogma: the DNA code is transcribed into RNA code, which is translated into protein in a second step that covers the production of protein.

Protein production is the biotechnological process of generating a specific protein. It is typically achieved by manipulating gene expression in an organism so that it expresses large amounts of a recombinant gene. This includes the transcription of the recombinant DNA into messenger RNA (mRNA) and the translation of the mRNA into polypeptide chains, which are ultimately folded into functional proteins and may be targeted to specific subcellular or extracellular locations.[1]
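The two steps named above, transcription and translation, can be illustrated with a minimal sketch. It assumes the input is the coding (sense) strand of the DNA, uses the standard genetic code, and ignores folding and targeting entirely; the function names and the example gene are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G codon order.
# "*" marks a stop codon.
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product("UCAG", repeat=3), _AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA for a DNA coding strand (T is replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate from the first AUG up to the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


# Hypothetical five-codon gene used only for illustration.
gene = "ATGGGCAAATTCTAA"
mrna = transcribe(gene)
print(mrna)             # AUGGGCAAAUUCUAA
print(translate(mrna))  # MGKF
```

The output is a one-letter amino acid sequence; a real polypeptide would still need to fold, and possibly be modified, before it is functional.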

Protein production systems (also known as expression systems) are used in the life sciences, biotechnology, and medicine. Molecular biology research uses numerous proteins and enzymes, many of which come from expression systems, particularly DNA polymerase for PCR, reverse transcriptase for RNA analysis, and restriction endonucleases for cloning. Expression systems are also used to make proteins that are screened in drug discovery as biological targets or as potential drugs themselves. There are also significant applications for expression systems in industrial fermentation, notably the production of biopharmaceuticals such as human insulin to treat diabetes, and the manufacture of enzymes.

Protein production systems

Commonly used protein production systems include those derived from bacteria,[2][3] yeast,[4][5] baculovirus/insect,[6] mammalian cells,[7][8] and, more recently, filamentous fungi such as Myceliophthora thermophila.[9] When biopharmaceuticals are produced with one of these systems, process-related impurities termed host cell proteins also end up in the final product in trace amounts.[10]

Cell-based systems

The oldest and most widely used expression systems are cell-based and may be defined as the "combination of an expression vector, its cloned DNA, and the host for the vector that provide a context to allow foreign gene function in a host cell, that is, produce proteins at a high level".[11][12] Overexpression is an abnormally high level of gene expression that produces a pronounced gene-related phenotype.[13][14]

There are many ways to introduce foreign DNA into a cell for expression, and many different host cells may be used; each expression system has distinct advantages and liabilities. Expression systems are normally referred to by the host and the DNA source or the delivery mechanism for the genetic material. For example, common hosts are bacteria (such as E. coli and B. subtilis), yeast (such as S. cerevisiae[5]), or eukaryotic cell lines. Common DNA sources and delivery mechanisms are viruses (such as baculovirus, retrovirus, and adenovirus), plasmids, artificial chromosomes, and bacteriophages (such as lambda). The best expression system depends on the gene involved. For example, Saccharomyces cerevisiae is often preferred for proteins that require significant post-translational modification, and insect or mammalian cell lines are used when human-like splicing of mRNA is required. Nonetheless, bacterial expression has the advantage of easily producing large amounts of protein, which is required for structure determination by X-ray crystallography or nuclear magnetic resonance experiments.

Because bacteria are prokaryotes, they are not equipped with the full enzymatic machinery to accomplish the required post-translational modifications or molecular folding. Hence, multi-domain eukaryotic proteins expressed in bacteria often are non-functional. Also, many proteins become insoluble as inclusion bodies that are difficult to recover without harsh denaturants and subsequent cumbersome protein-refolding.

To address these concerns, expression systems using various eukaryotic cells were developed for applications that require proteins to be folded as in, or closer to, eukaryotic organisms: cells of plants (e.g. tobacco), insects, or mammals (e.g. bovines) are transfected with genes and cultured in suspension, or even as tissues or whole organisms, to produce fully folded proteins. Mammalian in vivo expression systems, however, have low yields and other limitations, such as long production times and toxicity to host cells. To combine the high yield, productivity, and scalability of bacteria and yeast with the advanced epigenetic features of plant, insect, and mammalian systems, other protein production systems have been developed using unicellular eukaryotes (e.g. non-pathogenic Leishmania cells).
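The selection reasoning in this section can be summarised as a small rule-based sketch. It encodes only the rules of thumb stated above (mammalian or insect cells for human-like splicing, S. cerevisiae for significant post-translational modification, bacteria otherwise); the class and function names are hypothetical, and a real choice would also weigh yield, cost, solubility, and toxicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirements:
    """Hypothetical description of what the target protein needs."""
    needs_post_translational_modification: bool = False
    needs_human_like_splicing: bool = False


def suggest_host(req: Requirements) -> str:
    """Return a first-guess host class, following the text's rules of thumb."""
    if req.needs_human_like_splicing:
        # The most demanding requirement is checked first.
        return "insect or mammalian cell line"
    if req.needs_post_translational_modification:
        return "Saccharomyces cerevisiae"
    # No eukaryotic processing needed: bacteria easily give large amounts.
    return "Escherichia coli"


print(suggest_host(Requirements()))
print(suggest_host(Requirements(needs_post_translational_modification=True)))
print(suggest_host(Requirements(needs_human_like_splicing=True)))
```

The ordering of the checks is a design assumption: a protein that needs both splicing and modification is routed to the more capable host.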

Bacterial systems

Escherichia coli
E. coli, one of the most popular hosts for artificial gene expression.

E. coli is one of the most widely used expression hosts, and DNA is normally introduced in a plasmid expression vector. The techniques for overexpression in E. coli are well developed and work by increasing the number of copies of the gene or by increasing the binding strength of the promoter region, thereby assisting transcription.[3]

For example, a DNA sequence for a protein of interest could be cloned or subcloned into a high copy-number plasmid containing the lac (often LacUV5) promoter, which is then transformed into the bacterium E. coli. Addition of IPTG (a lactose analog) activates the lac promoter and causes the bacteria to express the protein of interest.[2]

E. coli BL21 and BL21(DE3) are two strains commonly used for protein production. As members of the B lineage, they lack the Lon and OmpT proteases, which protects the produced proteins from degradation. The DE3 prophage found in BL21(DE3) provides T7 RNA polymerase (driven by the LacUV5 promoter), allowing vectors with the T7 promoter to be used instead.[15]
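The induction logic described for these strains can be written as a toy on/off model. This is a deliberately simplified sketch: it treats expression as a boolean, ignores leaky basal expression and repressor levels, and the function name is hypothetical.

```python
def is_expressed(strain: str, promoter: str, iptg_added: bool) -> bool:
    """Toy model: does the plasmid-borne gene get transcribed?"""
    if strain not in ("BL21", "BL21(DE3)"):
        raise ValueError(f"unknown strain: {strain}")
    if promoter not in ("lac", "T7"):
        raise ValueError(f"unknown promoter: {promoter}")
    if not iptg_added:
        return False  # both routes below depend on IPTG induction
    if promoter == "lac":
        return True  # the host's own RNA polymerase reads lac promoters
    # A T7 promoter needs T7 RNA polymerase, which only the DE3 prophage
    # supplies (itself under the IPTG-inducible LacUV5 promoter).
    return strain == "BL21(DE3)"


for strain in ("BL21", "BL21(DE3)"):
    for promoter in ("lac", "T7"):
        print(strain, promoter, is_expressed(strain, promoter, iptg_added=True))
```

In this model the only combination that fails under induction is a T7-promoter vector in plain BL21, which lacks the polymerase.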


Corynebacterium

Non-pathogenic species of the gram-positive genus Corynebacterium are used for the commercial production of various amino acids. C. glutamicum is widely used for producing glutamate and lysine,[16] components of human food, animal feed and pharmaceutical products.

Functionally active human epidermal growth factor has been expressed in C. glutamicum,[17] demonstrating a potential for industrial-scale production of human proteins. Expressed proteins can be targeted for secretion through either the general secretory pathway (Sec) or the twin-arginine translocation pathway (Tat).[18]

Unlike gram-negative bacteria, the gram-positive Corynebacterium species lack the lipopolysaccharides that function as antigenic endotoxins in humans.[citation needed]

Pseudomonas fluorescens

The non-pathogenic, gram-negative bacterium Pseudomonas fluorescens is used for high-level production of recombinant proteins, commonly for the development of biotherapeutics and vaccines. P. fluorescens is a metabolically versatile organism, allowing for high-throughput screening and rapid development of complex proteins. It is best known for its ability to rapidly and successfully produce high titers of active, soluble protein.[19]

Eukaryotic systems


Yeasts

Expression systems using either S. cerevisiae or Pichia pastoris allow stable and lasting production of proteins at high yield in chemically defined media. The proteins are processed similarly to those produced in mammalian cells.[4][5]

Filamentous fungi

Filamentous fungi, especially Aspergillus and Trichoderma, have long been used to produce diverse industrial enzymes from their own genomes ("native", "homologous") and from recombinant DNA ("heterologous").[9]

More recently, Myceliophthora thermophila C1 has been developed into an expression platform for screening and production of native and heterologous proteins. The C1 expression system shows a low-viscosity morphology in submerged culture, enabling the use of complex growth and production media. C1 also does not "hyperglycosylate" heterologous proteins, as Aspergillus and Trichoderma tend to do.[9]

Baculovirus-infected cells

Baculovirus-infected insect cells[20] (Sf9, Sf21, High Five strains) or mammalian cells[21] (HeLa, HEK 293) allow production of glycosylated or membrane proteins that cannot be produced using fungal or bacterial systems.[20][6] This system is useful for producing proteins in high quantity. Genes are not expressed continuously because infected host cells eventually lyse and die during each infection cycle.[22]

Non-lytic insect cell expression

Non-lytic insect cell expression is an alternative to the lytic baculovirus expression system. In non-lytic expression, vectors are transiently transfected into insect cells or stably integrated into their chromosomal DNA for subsequent gene expression.[23][24] This is followed by selection and screening of recombinant clones.[25] The non-lytic system has been used to give higher protein yield and quicker expression of recombinant genes compared to baculovirus-infected cell expression.[24] Cell lines used for this system include Sf9 and Sf21 from Spodoptera frugiperda, Hi-5 from Trichoplusia ni, and Schneider 2 and Schneider 3 cells from Drosophila melanogaster.[23][25] With this system, cells do not lyse and several cultivation modes can be used.[23] Additionally, protein production runs are reproducible,[23][24] and the system gives a homogeneous product.[24] A drawback of this system is the requirement of an additional screening step for selecting viable clones.[25]


Leishmania tarentolae

Expression systems based on Leishmania tarentolae, which cannot infect mammals, allow stable and lasting production of proteins at high yield in chemically defined media. Produced proteins exhibit fully eukaryotic post-translational modifications, including glycosylation and disulfide bond formation.[citation needed]

Mammalian systems

The most common mammalian expression systems are Chinese hamster ovary (CHO) and human embryonic kidney (HEK) cells.[26][27][28]

Cell-free systems

Cell-free production of proteins is performed in vitro using purified RNA polymerase, ribosomes, tRNA and ribonucleotides. These reagents may be produced by extraction from cells or from a cell-based expression system. Due to the low expression levels and high cost of cell-free systems, cell-based systems are more widely used.[29]

References

  1. Gräslund S, Nordlund P, Weigelt J, Hallberg BM, Bray J, Gileadi O, et al. (February 2008). "Protein production and purification". Nature Methods. 5 (2): 135–46. doi:10.1038/nmeth.f.202. PMC 3178102. PMID 18235434.
  2. Baneyx F (October 1999). "Recombinant protein expression in Escherichia coli". Current Opinion in Biotechnology. 10 (5): 411–21. doi:10.1016/s0958-1669(99)00003-8. PMID 10508629.
  3. Rosano, Germán; Ceccarelli, Eduardo (2014-04-17). "Recombinant protein expression in Escherichia coli: advances and challenges". Frontiers in Microbiology. 5: 172. doi:10.3389/fmicb.2014.00172. PMC 4029002. PMID 24860555.
  4. Cregg JM, Cereghino JL, Shi J, Higgins DR (September 2000). "Recombinant protein expression in Pichia pastoris". Molecular Biotechnology. 16 (1): 23–52. doi:10.1385/MB:16:1:23. PMID 11098467. S2CID 35874864.
  5. Malys N, Wishart JA, Oliver SG, McCarthy JE (2011). "Protein production in Saccharomyces cerevisiae for systems biology studies". Methods in Systems Biology. Methods in Enzymology. Vol. 500. pp. 197–212. doi:10.1016/B978-0-12-385118-5.00011-6. ISBN 9780123851185. PMID 21943899.
  6. Kost TA, Condreay JP, Jarvis DL (May 2005). "Baculovirus as versatile vectors for protein expression in insect and mammalian cells". Nature Biotechnology. 23 (5): 567–75. doi:10.1038/nbt1095. PMC 3610534. PMID 15877075.
  7. Rosser MP, Xia W, Hartsell S, McCaman M, Zhu Y, Wang S, Harvey S, Bringmann P, Cobb RR (April 2005). "Transient transfection of CHO-K1-S using serum-free medium in suspension: a rapid mammalian protein expression system". Protein Expression and Purification. 40 (2): 237–43. doi:10.1016/j.pep.2004.07.015. PMID 15766864.
  8. Lackner A, Genta K, Koppensteiner H, Herbacek I, Holzmann K, Spiegl-Kreinecker S, Berger W, Grusch M (September 2008). "A bicistronic baculovirus vector for transient and stable protein expression in mammalian cells". Analytical Biochemistry. 380 (1): 146–8. doi:10.1016/j.ab.2008.05.020. PMID 18541133.
  9. Visser H, Joosten V, Punt PJ, Gusakov AV, Olson PT, Joosten R, et al. (June 2011). "Development of a mature fungal technology and production platform for industrial enzymes based on a Myceliophthora thermophila isolate, previously known as Chrysosporium lucknowense C1". Industrial Biotechnology. 7 (3): 214–223. doi:10.1089/ind.2011.7.214. Aspergillus and Trichoderma are currently the main fungal genera used to produce industrial enzymes.
  10. Wang, Xing; Hunter, Alan K.; Mozier, Ned M. (2009-06-15). "Host cell proteins in biologics development: Identification, quantitation and risk assessment". Biotechnology and Bioengineering. 103 (3): 446–458. doi:10.1002/bit.22304. ISSN 0006-3592. PMID 19388135. S2CID 22707536.
  11. "Definition: expression system". Online Medical Dictionary. Centre for Cancer Education, University of Newcastle upon Tyne: Cancerweb. 1997-11-13. Retrieved 2008-06-10.
  12. "Expression system - definition". Biology Online. 2005-10-03. Retrieved 2008-06-10.
  13. "overexpression". Oxford Living Dictionary. Oxford University Press. 2017. Archived from the original on February 10, 2018. Retrieved 18 May 2017. The production of abnormally large amounts of a substance which is coded for by a particular gene or group of genes; the appearance in the phenotype to an abnormally high degree of a character or effect attributed to a particular gene.
  14. "overexpress". NCI Dictionary of Cancer Terms. National Cancer Institute at the National Institutes of Health. 2011-02-02. Retrieved 18 May 2017. In biology, to make too many copies of a protein or other substance. Overexpression of certain proteins or other substances may play a role in cancer development.
  15. Jeong, H; Barbe, V; Lee, CH; Vallenet, D; Yu, DS; Choi, SH; Couloux, A; Lee, SW; Yoon, SH; Cattolico, L; Hur, CG; Park, HS; Ségurens, B; Kim, SC; Oh, TK; Lenski, RE; Studier, FW; Daegelen, P; Kim, JF (11 December 2009). "Genome sequences of Escherichia coli B strains REL606 and BL21(DE3)". Journal of Molecular Biology. 394 (4): 644–52. doi:10.1016/j.jmb.2009.09.052. PMID 19786035.
  16. Brinkrolf K, Schröder J, Pühler A, Tauch A (September 2010). "The transcriptional regulatory repertoire of Corynebacterium glutamicum: reconstruction of the network controlling pathways involved in lysine and glutamate production". Journal of Biotechnology. 149 (3): 173–82. doi:10.1016/j.jbiotec.2009.12.004. PMID 19963020.
  17. Date M, Itaya H, Matsui H, Kikuchi Y (January 2006). "Secretion of human epidermal growth factor by Corynebacterium glutamicum". Letters in Applied Microbiology. 42 (1): 66–70. doi:10.1111/j.1472-765x.2005.01802.x. PMID 16411922.
  18. Meissner D, Vollstedt A, van Dijl JM, Freudl R (September 2007). "Comparative analysis of twin-arginine (Tat)-dependent protein secretion of a heterologous model protein (GFP) in three different Gram-positive bacteria". Applied Microbiology and Biotechnology. 76 (3): 633–42. doi:10.1007/s00253-007-0934-8. PMID 17453196. S2CID 6238466.
  19. Retallack DM, Jin H, Chew L (February 2012). "Reliable protein production in a Pseudomonas fluorescens expression system". Protein Expression and Purification. 81 (2): 157–65. doi:10.1016/j.pep.2011.09.010. PMID 21968453.
  20. Altmann F, Staudacher E, Wilson IB, März L (February 1999). "Insect cells as hosts for the expression of recombinant glycoproteins". Glycoconjugate Journal. 16 (2): 109–23. doi:10.1023/A:1026488408951. PMID 10612411. S2CID 34863069.
  21. Kost TA, Condreay JP (October 1999). "Recombinant baculoviruses as expression vectors for insect and mammalian cells". Current Opinion in Biotechnology. 10 (5): 428–33. doi:10.1016/S0958-1669(99)00005-1. PMID 10508635.
  22. Yin J, Li G, Ren X, Herrler G (January 2007). "Select what you need: a comparative evaluation of the advantages and limitations of frequently used expression systems for foreign genes". Journal of Biotechnology. 127 (3): 335–47. doi:10.1016/j.jbiotec.2006.07.012. PMID 16959350.
  23. Dyring, Charlotte (2011). "Optimising the drosophila S2 expression system for production of therapeutic vaccines". BioProcessing Journal. 10 (2): 28–35. doi:10.12665/j102.dyring.
  24. Olczak M, Olczak T (December 2006). "Comparison of different signal peptides for protein secretion in nonlytic insect cell system". Analytical Biochemistry. 359 (1): 45–53. doi:10.1016/j.ab.2006.09.003. PMID 17046707.
  25. McCarroll L, King LA (October 1997). "Stable insect cell cultures for recombinant protein production". Current Opinion in Biotechnology. 8 (5): 590–4. doi:10.1016/s0958-1669(97)80034-1. PMID 9353223.
  26. Zhu J (2012-09-01). "Mammalian cell protein expression for biopharmaceutical production". Biotechnology Advances. 30 (5): 1158–70. doi:10.1016/j.biotechadv.2011.08.022. PMID 21968146.
  27. Almo SC, Love JD (June 2014). "Better and faster: improvements and optimization for mammalian recombinant protein production". Current Opinion in Structural Biology. New constructs and expression of proteins / Sequences and topology. 26: 39–43. doi:10.1016/ PMC 4766836. PMID 24721463.
  28. Hacker DL, Balasubramanian S (June 2016). "Recombinant protein production from stable mammalian cell lines and pools". Current Opinion in Structural Biology. New constructs and expression of proteins • Sequences and topology. 38: 129–36. doi:10.1016/ PMID 27322762.
  29. Rosenblum G, Cooperman BS (January 2014). "Engine out of the chassis: cell-free protein synthesis and its uses". FEBS Letters. 588 (2): 261–8. doi:10.1016/j.febslet.2013.10.016. PMC 4133780. PMID 24161673.
