Beijing Genomics Institute

Industry: Genome sequencing
Founded: September 9, 1999 (Beijing)
Headquarters: Shenzhen, Guangdong, China
Locations: Shenzhen, Hong Kong, Wuhan, Hangzhou, Beijing, China; Boston, USA; Copenhagen, Denmark; Brisbane, Australia
Key people: Yang Huanming (Chairman); Wang Jian (President); Mu Feng (rotating CEO); Yang Shuang (COO)
Divisions: BGI China (Mainland); BGI Asia Pacific; BGI Americas (North and South America); BGI Europe (Europe and Africa)

BGI (Chinese: 华大基因; pinyin: Huádà Jīyīn), known as the Beijing Genomics Institute prior to 2008, is a genome sequencing center headquartered in Shenzhen, Guangdong, China.[1]


Wang Jian, Yu Jun, Yang Huanming and Liu Siqi created BGI in November 1999[2] in Beijing, China as a non-governmental independent research institute in order to participate in the Human Genome Project as China's representative.[3][4] After the project was completed, funding dried up, so BGI moved to Hangzhou in exchange for funding from the Hangzhou Municipal Government.

In 2002, BGI sequenced the rice genome, which was featured as a cover story in the journal Science. In 2003, BGI decoded the SARS virus genome and created a kit for detecting the virus. That same year, BGI Hangzhou and Zhejiang University founded a new research institute, the James D. Watson Institute of Genome Sciences, Zhejiang University. The Watson Institute was intended to become a major center for research and education in East Asia, modelled after the Cold Spring Harbor Laboratory in the US.

In 2007, BGI’s headquarters relocated to Shenzhen as "the first citizen-managed, non-profit research institution in China". Yu Jun left BGI at this time, purportedly selling his stake to the other three founders for a nominal sum.[2] In 2008, BGI-Shenzhen was officially recognized as a state agency,[5] and BGI published the first human genome of an Asian individual.[3][6]

In 2010, BGI Shenzhen was certified as meeting the requirements of the ISO9001:2008 standard for the design and provision of high-throughput sequencing services.[7] The same year, BGI bought 128 sequencing machines and claimed to be the world's largest genome center.[3]

In 2010 it was reported that BGI would receive US$1.5 billion in “collaborative funds” over the next 10 years from the China Development Bank.[8][9] In 2010, BGI Americas was established with its main office in Cambridge, Massachusetts[10] and BGI Europe was established in Copenhagen.[11]

In 2011 BGI reported it employed 4,000 scientists and technicians.[1] BGI did the genome sequencing for the deadly 2011 Germany E. coli O104:H4 outbreak in three days under open licence.[12]

In 2013 BGI reported it had relationships with 17 out of the top 20 global pharmaceutical companies[10][13] and advertised that it provided commercial science, health, agricultural, and informatics services to global pharmaceutical companies.[14] That year it bought Complete Genomics of Mountain View, California, a major supplier of DNA sequencing technology, for US$118 million.[12]

The institute has described itself as partly private and partly public, receiving funds both from private investors and the Chinese government. The laboratory was also the Bioinformatics Center of the Chinese Academy of Sciences.

Key achievements

  • First to de novo sequence and assemble mammalian[15] and human genomes with short-read sequencing (so-called "next generation sequencing")[16]
  • Sequenced the first ancient human’s genome[17]
  • Sequenced the first diploid genome of an Asian individual,[18] as part of the Yan Huang project
  • Initiated building a sequence map of the human pan-genome, estimated to contain 19-40 million bases not in the human reference genome[19][20]
  • Contributed 10% of sequence information for the International HapMap Project
  • Contributed 1% of the Human Genome Project’s reference genome as its first project, and was the only institute in the developing world to contribute
  • Produced proof-of-principle study for sequencing the microbiome of the human digestive system, an estimated 150 times larger than the human genome[21][22]
  • Key sequencing center in the 1000 Genomes Project
  • First Chinese institution to sequence the severe acute respiratory syndrome (SARS) virus, just hours after the first sequencing of the virus by Canadians[23]
  • Key player in the analysis of the 2011 E. coli O104:H4 outbreak[24]
  • Sequenced 40 domesticated and wild silkworms, identifying 354 genes likely important in domestication.[25]
  • Sequenced the first giant panda genome,[15] equal in size to the human genome, in less than 8 months.[26] Sequencing revealed that the giant panda, Ailuropoda melanoleuca, has a frameshift mutation in T1R1, a gene involved in sensing savory flavors. The mutation might be the genetic reason why the panda prefers bamboo over meat. However, the panda also lacks genes expected for bamboo digestion, so its microbiome might play a key role in metabolizing its main source of food.[15]
  • Key player in the Sino-British Chicken Genome Project
  • As of 2010, plant genomes sequenced include rice, cucumber, soybean, and Sorghum. Animal genomes sequenced include silkworm, honey bee, water flea, lizard, and giant panda. An additional 40 animal and plant species and over 1000 bacteria had also been sequenced.[4][25][27]
  • In 2010, Nature ranked BGI Shenzhen fourth among the top ten institutions in China, with all the others being universities and the Chinese Academy of Sciences. The ranking was based on articles in Nature research journals. There were similar results for other top journals.[28]
  • In 2014, BGI was reported to be producing 500 cloned pigs a year to test new medicines.[29]

Current research projects

Human genetics

Yan Huang Project

The Yan Huang Project was started in 2007 and is named after two emperors believed to have founded China’s dominant ethnic group.[30] In this project, BGI planned to sequence at least 100 Chinese individuals to produce a high-resolution map of Chinese genetic polymorphisms.[31][32] The first genome data was published in October 2007.[33] An anonymous Chinese billionaire donated RMB 10 million (about US$1.4 million) to the project, and his genome was sequenced at the beginning of the project.[31][32]

The 1000 genomes project

Diabetes-associated Genes and Variations Study (LUCAMP) Cancer Genome Project

Nine Danish universities and institutes will collaborate with BGI in this targeted resequencing project.

BGI explores genome and gene variation associated with complex diseases in large-scale studies, primarily using two methods: PCR-based resequencing of candidate genes and exon-capture-based whole exome resequencing.

Cognitive Research Lab

The Cognitive Research Lab at BGI is working with Stephen Hsu on a project to discover the genetic basis of human intelligence.[34]

Animals and plants

1,000 Plant and Animal Reference Project

BGI is leading an international collaboration to sequence 1,000 plants and animals of economic and scientific import within two years. It has pledged an initial US$100 million to start the program.[35]

As of March 2010, BGI had sequenced the genomes of 20 species of animals and 9 species of plants, sometimes for multiple individuals (such as 40 silkworms[25]), and had an equal number underway.

Three Extreme-Environment Animal Genomes Project

In 2009, BGI-Shenzhen announced the launch of three genome projects that focus on animals living in extreme environments. The three selected genomes are those of two polar animals, the polar bear and the emperor penguin, and one high-plateau animal, the Tibetan antelope.[36]

International Big Cats Genome Project

In 2010, BGI, Beijing University, Heilongjiang Manchurian tiger forestry zoo, Kunming Institute of Zoology, San Diego Zoo Institute for Conservation Research in California, and others announced they would sequence the Amur tiger, South China tiger, Bengal tiger, Asiatic lion, African lion, clouded leopard, snow leopard, and other felines. BGI would also sequence the genomes and epigenomes of a liger and a tigon. Since the two reciprocal hybrids have different phenotypes, despite being genetically identical, it was expected that the epigenome might reveal the basis of such differences.[37] The project's aim was to significantly advance conservation research, and it was auspiciously announced for the Chinese Year of the Tiger.[38]

Results were reported in 2013 for the genomes of the Amur tiger, the white Bengal tiger, the African lion, the white African lion and the snow leopard.[39]

Symbiont Genome Project

In a jointly funded project announced on March 19, 2010, BGI will collaborate with Sidney K. Pierce of the University of South Florida and Charles Delwiche of the University of Maryland at College Park to sequence the genomes of the sea slug Elysia chlorotica and its algal food, Vaucheria litorea. The sea slug uses genes from the alga to synthesize chlorophyll, the first interspecies gene transfer discovered. Sequencing their genomes could elucidate the mechanism of that transfer.[40]


Ten Thousand Microbial Genomes Project

Bioinformatics technology

De novo sequencing requires assembling billions of short strings of DNA sequence into a full genome, which for humans is itself three billion base pairs long.

BGI’s computational biologists developed the first successful algorithm, based on graph theory, for assembling the billions of 25- to 75-base-pair strings produced by next-generation sequencers, specifically Illumina’s Genome Analyzer, during de novo sequencing. The algorithm, called SOAPdenovo, can assemble a genome in two days[20] and has been used to assemble an array of plant and animal genomes.

BGI’s 500-node supercomputer processes 10 terabytes of raw sequencing data every 24 hours from its current 30 or so Genome Analyzers from Illumina. The annual budget for the computer center is US$9 million.[41]
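The figures quoted above imply a sustained data rate that is easy to work out. The short Python sketch below does the arithmetic, assuming decimal terabytes (10^12 bytes), exactly 30 sequencers, and an even load across machines; none of these assumptions comes from the cited source, and the function name is purely illustrative.

```python
def throughput_summary(tb_per_day=10.0, machines=30):
    """Convert a daily raw-data volume into a few rough derived rates.

    Assumes decimal units (1 TB = 1e12 bytes) and an even load per machine.
    """
    bytes_per_day = tb_per_day * 1e12
    seconds_per_day = 24 * 60 * 60
    return {
        # Average sustained rate the cluster must absorb, in megabytes per second
        "mb_per_second": bytes_per_day / seconds_per_day / 1e6,
        # Average daily output of one sequencer, in gigabytes
        "gb_per_machine_per_day": bytes_per_day / machines / 1e9,
        # Raw data accumulated over a 365-day year, in petabytes
        "pb_per_year": tb_per_day * 365 / 1000,
    }


summary = throughput_summary()
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the cluster absorbs roughly 116 MB per second on average, each sequencer contributes about 333 GB per day, and the raw output amounts to about 3.65 PB per year.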

SOAPdenovo is part of the "Short Oligonucleotide Analysis Package" (SOAP), a suite of tools developed by BGI for de novo assembly of human-sized genomes, alignment, SNP detection, resequencing, indel finding, and structural variation analysis. Built for the short reads of Illumina sequencers, SOAPdenovo has been used to assemble multiple human genomes[16][17][18] (identifying an eight-kilobase insertion not detected by mapping to the human reference genome[42]) and animal genomes, such as that of the giant panda.[15]
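The graph-based approach to short-read assembly described above can be illustrated with a toy de Bruijn graph assembler. The Python sketch below is a teaching example only, not SOAPdenovo's actual implementation: it assumes error-free reads, a single strand, and a genome in which no (k-1)-mer repeats, and the function names `build_de_bruijn` and `assemble` are hypothetical. Real assemblers also handle sequencing errors, repeats, reverse complements, and paired-end information.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Build a de Bruijn graph: each distinct k-mer becomes one edge
    from its (k-1)-base prefix to its (k-1)-base suffix."""
    graph = defaultdict(list)
    seen = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if kmer in seen:
                continue  # count each distinct k-mer once
            seen.add(kmer)
            graph[kmer[:-1]].append(kmer[1:])
    return graph


def assemble(reads, k):
    """Reconstruct a sequence by walking an Eulerian path through the graph
    (Hierholzer's algorithm). Assumes such a path exists."""
    graph = build_de_bruijn(reads, k)

    # Count incoming edges so the start node can be identified.
    indegree = defaultdict(int)
    for successors in graph.values():
        for node in successors:
            indegree[node] += 1

    # The path starts at the node with one more outgoing than incoming edge;
    # if there is none (a cycle), any node will do.
    start = next(
        (n for n in graph if len(graph[n]) - indegree.get(n, 0) == 1),
        next(iter(graph)),
    )

    remaining = {node: list(succ) for node, succ in graph.items()}
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if remaining.get(node):
            stack.append(remaining[node].pop())
        else:
            path.append(stack.pop())
    path.reverse()

    # Spell the sequence: the first node, then the last base of each later node.
    return path[0] + "".join(node[-1] for node in path[1:])


reads = ["ATGGCG", "GGCGTG", "CGTGCA"]
print(assemble(reads, k=4))  # ATGGCGTGCA
```

The three overlapping six-base reads are broken into four-base k-mers, and the walk through the resulting graph recovers the ten-base sequence they were drawn from. Working with k-mers rather than comparing every pair of reads is what makes this family of methods practical for billions of short reads.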

See also


  1. ^ a b Lone Frank, High-Quality DNA, Apr 24, 2011, The Daily Beast
  2. ^ a b Shu-Ching Jean Chen, (2 September 2013) Genomic Dreams Coming True in China Forbes Asia, Retrieved 27 October 2014
  3. ^ a b c Kevin Davies, (27 September 2011) The Bedrock of BGI: Huanming Yang Bio-IT World, Retrieved 14 January 2014
  4. ^ a b The dragon's DNA, Jun 17th 2010, The Economist
  5. ^ About BGI, BGI
  6. ^ Ye, Jia (2008) An Interview with a Leader in Genomics — Beijing Genomics Institute Asia Biotech, Retrieved 14 January 2013
  7. ^ "Next Generation of High-Throughput Sequencing Service of BGI Received the ISO9001 Certification". 23 March 2010. Retrieved 14 January 2014. 
  8. ^ "BGI to Receive $1.5B in 'Collaborative Funds' Over 10 Years from China Development Bank | In Sequence | Sequencing | GenomeWeb". Retrieved 29 March 2010. 
  9. ^ Fox, J.; Kling, J. (2010). "Chinese institute makes bold sequencing play". Nature Biotechnology. 28 (3): 189–191. doi:10.1038/nbt0310-189c. PMID 20212469. 
  10. ^ a b (2013) Introduction to BGI Americas BGI official web page, Retrieved 14 January 2014
  11. ^ (2013) BGI Europe BGI official web page, Retrieved 14 January 2014
  12. ^ a b Specter, Michael (6 January 2014) The Gene Factory The New Yorker, Retrieved 28 October 2014
  13. ^ Pharma and Biotech Services Introduction, BGI
  14. ^ BGI - Industry, BGI
  15. ^ a b c d Li, R.; Fan, W.; Tian, G.; Zhu, H.; He, L.; Cai, J.; Huang, Q.; Cai, Q.; Li, B.; Bai, Y.; Zhang, Z.; Zhang, Y.; Wang, W.; Li, J.; Wei, F.; Li, H.; Jian, M.; Li, J.; Zhang, Z.; Nielsen, R.; Li, D.; Gu, W.; Yang, Z.; Xuan, Z.; Ryder, O. A.; Leung, F. C. C.; Zhou, Y.; Cao, J.; Sun, X.; Fu, Y. (2009). "The sequence and de novo assembly of the giant panda genome". Nature. 463 (7279): 311–317. Bibcode:2010Natur.463..311L. doi:10.1038/nature08696. PMC 3951497. PMID 20010809.
  16. ^ a b Li, R.; Zhu, H.; Ruan, J.; Qian, W.; Fang, X.; Shi, Z.; Li, Y.; Li, S.; Shan, G.; Kristiansen, K.; Li, S.; Yang, H.; Wang, J.; Wang, J. (2009). "De novo assembly of human genomes with massively parallel short read sequencing". Genome Research. 20 (2): 265–272. doi:10.1101/gr.097261.109. PMC 2813482. PMID 20019144.
  17. ^ a b Rasmussen, M.; Li, Y.; Lindgreen, S.; Pedersen, J. S.; Albrechtsen, A.; Moltke, I.; Metspalu, M.; Metspalu, E.; Kivisild, T.; Gupta, R.; Bertalan, M.; Nielsen, K.; Gilbert, M. T. P.; Wang, Y.; Raghavan, M.; Campos, P. F.; Kamp, H. M.; Wilson, A. S.; Gledhill, A.; Tridico, S.; Bunce, M.; Lorenzen, E. D.; Binladen, J.; Guo, X.; Zhao, J.; Zhang, X.; Zhang, H.; Li, Z.; Chen, M.; Orlando, L. (2010). "Ancient human genome sequence of an extinct Palaeo-Eskimo". Nature. 463 (7282): 757–762. Bibcode:2010Natur.463..757R. doi:10.1038/nature08835. PMC 3951495. PMID 20148029.
  18. ^ a b Wang, J.; Wang, W.; Li, R.; Li, Y.; Tian, G.; Goodman, L.; Fan, W.; Zhang, J.; Li, J.; Zhang, J.; Guo, Y.; Feng, B.; Li, H.; Lu, Y.; Fang, X.; Liang, H.; Du, Z.; Li, D.; Zhao, Y.; Hu, Y.; Yang, Z.; Zheng, H.; Hellmann, I.; Inouye, M.; Pool, J.; Yi, X.; Zhao, J.; Duan, J.; Zhou, Y.; Qin, J. (2008). "The diploid genome sequence of an Asian individual". Nature. 456 (7218): 60–65. Bibcode:2008Natur.456...60W. doi:10.1038/nature07484. PMC 2716080. PMID 18987735.
  19. ^ Li, R.; Li, Y.; Zheng, H.; Luo, R.; Zhu, H.; Li, Q.; Qian, W.; Ren, Y.; Tian, G.; Li, J.; Zhou, G.; Zhu, X.; Wu, H.; Qin, J.; Jin, X.; Li, D.; Cao, H.; Hu, X.; Blanche, H. L. N.; Cann, H.; Zhang, X.; Li, S.; Bolund, L.; Kristiansen, K.; Yang, H.; Wang, J.; Wang, J. (2009). "Building the sequence map of the human pan-genome". Nature Biotechnology. 28 (1): 57–63. doi:10.1038/nbt.1596. PMID 19997067. 
  20. ^ a b "To Start Building 'Human Pan-Genome,' BGI De Novo Assembles Two Genomes from Illumina Data | In Sequence | Sequencing | GenomeWeb". Retrieved 29 March 2010. 
  21. ^ Qin, J.; Li, R.; Raes, J.; Arumugam, M.; Burgdorf, K. S.; Manichanh, C.; Nielsen, T.; Pons, N.; Levenez, F.; Yamada, T.; Mende, D. R.; Li, J.; Xu, J.; Li, S.; Li, D.; Cao, J.; Wang, B.; Liang, H.; Zheng, H.; Xie, Y.; Tap, J.; Lepage, P.; Bertalan, M.; Batto, J. M.; Hansen, T.; Le Paslier, D.; Linneberg, A.; Nielsen, H. B. R.; Pelletier, E.; Renault, P. (2010). "A human gut microbial gene catalogue established by metagenomic sequencing". Nature. 464 (7285): 59–65. Bibcode:2010Natur.464...59.. doi:10.1038/nature08821. PMC 3779803. PMID 20203603.
  22. ^ "International Team Catalogs Microbial Genes in the Human Gut | GenomeWeb Daily News | Sequencing | GenomeWeb". Archived from the original on 7 March 2010. Retrieved 29 March 2010. 
  23. ^ Enserink, M. (2003). "SARS IN CHINA: China's Missed Chance". Science. 301 (5631): 294–296. doi:10.1126/science.301.5631.294. PMID 12869735. 
  24. ^ German Teams, BGI and Life Technologies Identify Deadly European E.coli Strain, March 23, 2012 | Bio-IT World
  25. ^ a b Xia, Q.; Guo, Y.; Zhang, Z.; Li, D.; Xuan, Z.; Li, Z.; Dai, F.; Li, Y.; Cheng, D.; Li, R.; Cheng, T.; Jiang, T.; Becquet, C.; Xu, X.; Liu, C.; Zha, X.; Fan, W.; Lin, Y.; Shen, Y.; Jiang, L.; Jensen, J.; Hellmann, I.; Tang, S.; Zhao, P.; Xu, H.; Yu, C.; Zhang, G.; Li, J.; Cao, J.; Liu, S. (2009). "Complete Resequencing of 40 Genomes Reveals Domestication Events and Genes in Silkworm (Bombyx)". Science. 326 (5951): 433–436. Bibcode:2009Sci...326..433X. doi:10.1126/science.1176620. PMID 19713493. 
  26. ^ Cyranoski, D. (2010). "Chinese bioscience: The sequence factory". Nature. 464 (7285): 22–24. doi:10.1038/464022a. PMID 20203579. 
  27. ^ Huang, S.; Li, R.; Zhang, Z.; Li, L.; Gu, X.; Fan, W.; Lucas, W.; Wang, X.; Xie, B.; Ni, P.; Ren, Y.; Zhu, H.; Li, J.; Lin, K.; Jin, W.; Fei, Z.; Li, G.; Staub, J.; Kilian, A.; Van Der Vossen, E. A. G.; Wu, Y.; Guo, J.; He, J.; Jia, Z.; Ren, Y.; Tian, G.; Lu, Y.; Ruan, J.; Qian, W.; Wang, M. (2009). "The genome of the cucumber, Cucumis sativus L". Nature Genetics. 41 (12): 1275–1281. doi:10.1038/ng.475. PMID 19881527. 
  28. ^ BGI Shenzhen Ranked 4th of Top 10 Institutions in NPI 2010 China, BGI
  29. ^ Shukman, David (14 January 2014) China cloning on an 'industrial scale' BBC News Science and Environment, Retrieved 14 January 2014
  30. ^ "Chinese scientists sequence 1st volunteer's genome". People's Daily Online. 7 January 2008. Retrieved 29 October 2014. 
  31. ^ a b Qiu, Jane; Hayden, Check (2008). "Genomics sizes up". Nature. 451 (7176): 234. Bibcode:2008Natur.451..234Q. doi:10.1038/451234a. PMID 18202611. 
  32. ^ a b "BGI Offers Next-Gen Sequencing Service, Kicks Off 100-Genome Sequencing Project | In Sequence | Sequencing | GenomeWeb". Genomeweb LLC. 8 January 2008. Retrieved 29 October 2014.
  33. ^ (20 November 2008) YanHuang - The First Asian Diploid Genome BGI Shenzhen web page, Retrieved 29 October 2014
  34. ^ "CGS : New Director's Experience a Plus for MSU, but his Controversial Views Concern Some". 
  35. ^ Fox, J.; Kling, J. (2010). "Chinese institute makes bold sequencing play". Nature Biotechnology. 28 (3): 189–191. doi:10.1038/nbt0310-189c. PMID 20212469. 
  36. ^ "Genome projects launched for three extreme-environment animals". 26 April 2009. Retrieved 22 May 2015. 
  37. ^ "BGI to Sequence Tiger, Lion, and Leopard Species This Year | In Sequence | Sequencing | GenomeWeb". Archived from the original on 28 February 2010. Retrieved 29 March 2010. 
  38. ^ "BGI". Archived from the original on 17 February 2010. Retrieved 29 March 2010. 
  39. ^ Cho, Y. S.; Hu, L.; Hou, H.; Lee, H.; Xu, J.; Kwon, S.; Oh, S.; Kim, H. M.; Jho, S.; Kim, S.; Shin, Y. A.; Kim, B. C.; Kim, H.; Kim, C. U.; Luo, S. J.; Johnson, W. E.; Koepfli, K. P.; Schmidt-Küntzel, A.; Turner, J. A.; Marker, L.; Harper, C.; Miller, S. M.; Jacobs, W.; Bertola, L. D.; Kim, T. H.; Lee, S.; Zhou, Q.; Jung, H. J.; Xu, X.; et al. (2013). "The tiger genome and comparative analysis with lion and snow leopard genomes". Nature Communications. 4: 2433. Bibcode:2013NatCo...4E2433C. doi:10.1038/ncomms3433. PMC 3778509. PMID 24045858.
  40. ^ "BGI". Archived from the original on 21 August 2010. Retrieved 29 March 2010. 
  41. ^ Petsko, G. A. (2010). "Rising in the East". Genome Biology. 11 (1): 102. doi:10.1186/gb-2010-11-1-102. PMC 2847708. PMID 20156314.
  42. ^ "BGI Uses New Short-Read Algorithm to Assemble Panda Genome as Proof of Concept for Human Genome | BioInform | Informatics | GenomeWeb". Retrieved 28 March 2010. 

External links