Drawing on its strong scientific research capabilities, BGI Research has participated in a number of advanced international research projects and has achieved major breakthroughs in many areas.

At present, BGI Research focuses mainly on genomics, transcriptomics, proteomics, metabolomics and other omics research, which has implications for health care, breeding, bio-energy and other areas of basic and applied research in the life sciences. Taking humans, animals, plants and microbes as research subjects, BGI Research explores and continuously develops core technologies at the DNA/RNA level, the histone modification level, the network/systems biology level, the population and evolutionary level, the metabolomics level, the synthetic biology level and the single-cell level.

To maintain its core competitiveness, BGI Research will continue to work on four areas of focus: personal genomics, cancer, reproductive health, and molecular breeding.

Personal genomics research builds on previous studies of gene structure, gene function, and their relationships with disease. At the personal genomics level, it aims to analyze and interpret all health-related information from a genetic point of view. This interpretation includes:

· Screening for monogenic diseases

· Risk prediction of polygenic diseases

· Consultancy on medical treatment for certain diseases

· Construction and maintenance of databases on human mutations, drug targets, etc.

Construction of a reference metagenome of the human gut

MetaHIT was a project funded by the European Commission under the Seventh Framework Programme (FP7). The project was initiated on January 1, 2008 and completed on June 30, 2012. It aimed to explore associations between the genes of the human intestinal microbiota and health and disease. Two disorders of increasing importance worldwide, inflammatory bowel disease (IBD) and obesity, were studied in this project. As a major player in the project, BGI constructed the research platform, developed several bioinformatics algorithms and contributed the majority of the work on sequencing data assembly, annotation, variation identification and functional analysis. In 2010, MetaHIT research was published as a cover story in Nature: "A human gut microbial gene catalogue established by metagenomic sequencing". In this study, we analyzed samples from 124 individuals of Danish and Spanish origin, some of whom were healthy and some of whom suffered from IBD or obesity. An extensive bioinformatics analysis revealed a staggering 3.3 million different genes among the individuals analyzed, 150-fold more than in our own genome. According to an appropriate statistical analysis, more than 85% of all the frequent genes carried by the 124 individuals were identified. Some 99% of the genes were of bacterial origin, in accordance with the predominance of bacteria among the intestinal microbes. The gene number suggested that at least 1000 bacterial species are frequent in our gut and that about 160 species are dominant; more interestingly, most of these 160 species are shared among different individuals. On the other hand, a large proportion of these genes still could not be perfectly aligned to a reference bacterial genome, which means they belong to unknown bacterial species.
The achievement of MetaHIT heralds an era of metagenomics, with rapid progress in techniques, more international cooperative projects in this domain, and a huge number of new bacteria and new genes to be discovered.

Cancer is one of the major diseases threatening human health today. As it is a complex disease, its pathogenesis, classification markers, and evolutionary progression are all pressing scientific problems that need to be resolved; in addition, diversity among individual cancer patients remains a major challenge for modern clinical treatment. With the development of "multi-omics" technologies, cancer research is undergoing a revolution. Comprehensive, systematic studies of cancer based on molecular biology will facilitate the development and application of new technologies and methods for cancer prevention, diagnosis and treatment, thus laying the cornerstone for the eventual eradication of cancer.

The BGI-Research Cancer Division was established with particular emphasis on personalized medicine, and adheres to the P4 (predictive, preventive, personalized and participatory) theory of medical treatment. The Division orients its work toward important societal needs, taking the "Healthy China 2020 Strategic Study Report" as its theoretical and policy basis. By integrating multi-omics with oncology, the Cancer Division strives to clarify the genetic mechanisms of cancer development, discover biomarkers that differentiate molecular subtypes, and develop translational applications for clinical use.

The cancer study team brings together interdisciplinary talent specialized in molecular genetics, genomics, bioinformatics, clinical medicine, chemistry and engineering, with an advisory board composed of reputable researchers, doctors and professionals from top institutions and hospitals worldwide. At present, the main work focuses on research and development in basic cancer research, translational cancer research, and multi-omics core technologies.

Systems Biology Study on Tumor Trans-omics

Systems biology is a field of study that attempts to integrate different levels of information (DNA, RNA, protein, etc.) to understand the functions and mechanisms of biological systems. By studying the correlations and interactions among the various parts of a biological system, one expects to construct a comprehensible model that accurately describes the system as a whole. In the new century, with advances in information processing and biological experimentation, researchers have begun to apply this approach to various types of diseases. The systems biology study of tumor trans-omics uses new multi-omics technologies to record the genetic variations that arise during tumorigenesis, evolution, growth, metastasis, and the development of drug resistance. This in turn facilitates the development of systemic analysis tools and software for trans-omics, and provides systems biology model support for clinical non-invasive tumor detection.

Single Cell Study on Tumor Cell Lineage, Selection and Phenotypic Relationship

In March 2012, BGI published two momentous papers in the same issue of the international academic journal "Cell". The cancer team developed a new method for single cell genome analysis and applied it to identify the genetic characteristics of essential thrombocythemia (ET, a kind of blood neoplasm) and clear cell renal cell carcinoma (ccRCC, a typical kidney cancer). The papers demonstrated that single cell analysis of highly heterogeneous tissues provides a much clearer picture of intratumoral genetics and developmental history than previous bulk tissue sequencing.

This new method has solved previous challenges associated with the high heterogeneity of tumor tissue samples, and has inspired new thinking and research directions for the further study of tumorigenesis, growth, diagnosis, and treatment at single-base resolution. The establishment of the single cell sequencing method laid the foundation for the cancer team's tumor cell studies. This year, the team will continue to carry out single cell studies of tumor cell lineage, selection and phenotypic relationship. The main purpose of these studies is to develop more advanced and accurate single cell analysis methods, which includes optimizing the single cell bioinformatics pipeline, updating single cell experimental methods, finding novel key driving factors in tumor evolution, and building an integrated application model for tumor cell lineage, selection, and phenotype studies. Ultimately, this will support structured studies on the functions, drug targets, and monoclonal antibodies of selected genes and mutations.

Multidisciplinary Bio-X Platform Construction Based on Microfluidics

Microfluidic technology can integrate the elementary units of sample collection, separation, reaction, and detection on a micron-scale chip, so that biological, chemical and medical tasks can be completed automatically. Because of its high level of integration, sensitivity and efficiency and its reduced cost, microfluidic technology has displayed great potential in the biological, chemical, and medical fields, and has formed a brand new interdisciplinary field involving biology, chemistry, medicine, electronics, fluid dynamics, materials science, mechanical engineering, etc. The cancer team has invited researchers from reputable labs worldwide to contribute to developing microfluidic technology and establishing a multidisciplinary platform (Bio-X), thereby forming core capabilities in prototype development, hypothesis verification and translation to manufacturing. With the aim of creating a micro-scale, automated, high-throughput, integrated system for sample extraction and library construction, the platform will provide open support for BGI's sustainable development in science, technology and industry.

Translational Study on Personalized Cancer Medical Care

Personalized cancer medical care, based on the individual characteristics of a patient's genetics and cancer genomics, provides individualized medication guidance for patients undergoing personalized therapeutic treatment. It can help patients choose appropriate chemotherapies and targeted drugs, improve treatment outcomes, and maximally prolong survival time. Translational cancer research also includes pharmacogenomic analysis of drugs and pharmacodynamic prediction, which serve the early screening, examination and prognostic monitoring of cancers.

BGI Research has established collaborations in personalized cancer research with Guangzhou Medical University First Affiliated Hospital, Shanghai Changhai Hospital, Beijing Tiantan Hospital, SYSU Cancer Center and a number of other domestic medical institutions, to promote the medical translation of cancer genomics achievements and facilitate the application of the latest cancer research outcomes to clinical treatment. So far, BGI Research has achieved its initial phased objectives in lung cancer, prostate cancer, glioma and thyroid cancer.

Non-invasive Prenatal Test

In China, prenatal screening for Down syndrome is based on maternal serum screening and ultrasound screening, which can detect only about 90% of trisomy 21 fetuses and has a high false-positive rate of about 5%. Karyotyping of fetal cells obtained by either chorionic villus sampling (CVS) or amniocentesis is the gold standard of prenatal diagnosis. However, these invasive procedures carry a miscarriage risk of around 0.5% to 1%. The discovery of cell-free fetal DNA (cff-DNA) in maternal peripheral blood brings a new opportunity for the non-invasive prenatal detection of fetal chromosomal aneuploidies.
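To illustrate why a 5% false-positive rate matters for a rare condition, the following minimal sketch applies Bayes' rule to the screening figures quoted above. The prevalence value is an assumption chosen purely for illustration and does not come from this text; the function name is likewise hypothetical.

```python
def positive_predictive_value(detection_rate, false_positive_rate, prevalence):
    """Probability that a screen-positive pregnancy is truly affected (Bayes' rule)."""
    true_positives = detection_rate * prevalence
    false_positives = false_positive_rate * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Figures quoted in the text for serum and ultrasound screening.
DETECTION_RATE = 0.90
FALSE_POSITIVE_RATE = 0.05

# ASSUMPTION: illustrative prevalence of trisomy 21 (about 1 in 700 pregnancies).
ASSUMED_PREVALENCE = 1 / 700

ppv = positive_predictive_value(DETECTION_RATE, FALSE_POSITIVE_RATE, ASSUMED_PREVALENCE)
print(f"Share of screen-positive results that are true positives: {ppv:.1%}")
```

Under this assumed prevalence, only a few percent of screen-positive results correspond to an affected fetus, so most women referred for invasive confirmation would be carrying an unaffected fetus.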

Based on massively parallel sequencing of DNA in maternal plasma, the distribution of fetal DNA across chromosomes can be recovered and used to detect fetal chromosomal aneuploidies. In theory, this method can be applied to the detection of all chromosomal aneuploidies, and it has already been validated in large-scale clinical practice for trisomy 21, 18 and 13. Reproductive health research focuses on non-invasive prenatal genetic testing techniques, which are being expanded, optimized and upgraded to cover other chromosomal abnormalities, second-trimester fetal monogenic and polygenic genetic defects, HLA typing, twin diseases, etc.
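One common way to turn chromosome-level read counts into an aneuploidy call is a z-score against euploid reference samples. The sketch below shows that general idea only; it is not presented as BGI's pipeline. All read counts, the function names and the cutoff of 3 are hypothetical values chosen for illustration.

```python
from statistics import mean, stdev


def chromosome_fraction(read_counts, chrom):
    """Fraction of all mapped reads that fall on the given chromosome."""
    return read_counts[chrom] / sum(read_counts.values())


def z_score(test_counts, reference_samples, chrom):
    """Standardize the test sample's chromosome fraction against euploid references."""
    ref_fractions = [chromosome_fraction(s, chrom) for s in reference_samples]
    return (chromosome_fraction(test_counts, chrom) - mean(ref_fractions)) / stdev(ref_fractions)


# HYPOTHETICAL read counts: chr21 versus all other chromosomes pooled together.
reference_samples = [
    {"chr21": 13000, "other": 987000},
    {"chr21": 13100, "other": 986900},
    {"chr21": 12900, "other": 987100},
    {"chr21": 13050, "other": 986950},
    {"chr21": 12950, "other": 987050},
]
test_sample = {"chr21": 13600, "other": 986400}

# ASSUMPTION: an illustrative cutoff; real assays validate their own threshold.
Z_THRESHOLD = 3.0

z = z_score(test_sample, reference_samples, "chr21")
flagged = z > Z_THRESHOLD
print(f"chr21 z-score = {z:.2f}, flagged as possible trisomy: {flagged}")
```

Because fetal DNA is only a fraction of the cell-free DNA in maternal plasma, a trisomy shifts the chromosome's share of reads only slightly, which is why the comparison is made against the spread observed in reference samples rather than against a fixed percentage.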

Congenital genetic defects

In China, 20% to 25% of people suffer from genetic diseases, including chromosomal diseases, monogenic disorders, and polygenic diseases. A person in the normal population carries an average of 5 to 6 recessive mutations (as a carrier) and passes them on to offspring in a certain proportion, which can lead to birth defects. Hundreds of chromosomal aberrations have been identified as causes of disease, and the Online Mendelian Inheritance in Man (OMIM) database has catalogued 21,821 genes related to monogenic disorders, covering about 3780 diseases with a phenotype description and known molecular basis. Most genetic diseases cannot yet be treated effectively, and they impose a severe economic burden on both society and families.

In recent years, reproductive health research has carried out genetic testing for high-incidence birth defects such as congenital heart disease, cleft lip and palate, intellectual disability, metabolic disorders, chromosomal abnormalities, and immune deficiency. High-throughput sequencing technology is applied to detect these diseases at the chromosome and gene level in blood, amniotic fluid, tissue and other samples. With the test results, researchers can establish and update a database of gene mutations and their frequencies in the normal population, identify the genes behind high-incidence and damaging genetic diseases, and analyze the pathogenic mechanisms involved. This will help to prevent birth defects and improve neonatal viability and quality of life.


In China, the number of infertile patients (with male or female factors) has exceeded 50 million, and the incidence of recurrent spontaneous abortion (RSA) is rising year after year. With the development of reproductive medicine, assisted reproductive technology (ART) has become one of the most important approaches to infertility treatment. BGI Research focuses on scientific research for reproductive medicine at the chromosomal and genetic level using next generation sequencing (NGS). By exploring the genetic factors of infertility and discovering the related molecular mechanisms and interaction networks of target genes, we can provide a foundation for clinical treatment and intervention. For couples undergoing IVF procedures, preimplantation genetic diagnosis/screening (PGD/PGS) provides an effective way to increase the success rate of IVF. By detecting embryonic chromosomal abnormalities with NGS and selecting normal embryos for transfer into the uterus, PGD/PGS can improve the live birth rate in IVF/ICSI and reduce RSA and birth defects.

Stem Cell

Stem cells hold great promise for new advances in the understanding of disease mechanisms and the development of new drugs and therapies. In addition, they offer a promising approach to cell therapy, especially for degenerative diseases. Building on a robust cell preservation platform and high-quality laboratory equipment, BGI Research works on cell preservation, stem cell differentiation, embryonic development, cell genetics and iPSCs, contributing to cell therapy, regenerative medicine and clinical application.


Reproductive health research also focuses on reproductive tract infection, intrauterine infection, and respiratory and intestinal infections in children, in order to clarify the harm that pathogens cause to reproductive health at each stage and to explore novel pathogenic microorganisms, thereby contributing to clinical therapy and vaccine design. It promises to yield a novel method for the rapid detection of the main pathogens in children's acute pneumonic fever or diarrhea of unclear etiology.

The molecular breeding lab at BGI Research focuses on the breeding of many important crops, and especially on the transformative progression from genomes to molecular breeding. The goal of the lab is to build a molecular breeding platform based on sequencing and genome analysis, and so to supply future breeding with new methods.

Progress in Rice

The Innovative Institute was established in September 2011, when the project to resequence 3000 core rice germplasm was initiated.

·  Resequencing 3000 core rice germplasm. To date, resequencing of 3200 rice accessions has been finished, and a proportion of the data has been analyzed. The average sequencing coverage of each sample is up to 14X. Work is now in progress to identify SNPs and construct a HapMap, which will be used to guide rice breeding.

·  Molecular breeding of rice based on genomics. With the resequencing data of 3000 rice germplasm, the highest-density molecular marker system has been developed, which can be used for locating the gene loci of important traits and for screening rice populations. Applying this key technique, population analyses of several super-rice varieties have been completed, and many loci of important traits have been located successfully, such as grain number per spike, setting rate, height, flag leaf width, filled-grain percentage, pest resistance, grain length and so on. We are now trying to combine all of these traits in one crop to build a new super rice strain.

Progress in Millet

In early 2010, BGI and Zhangjiakou Academy of Agriculture Science established a collaboration on the foxtail millet genome. The initial result was published in Nature Biotechnology (Genome sequence of foxtail millet (Setaria italica) provides insights into grass evolution and biofuel potential. Nature Biotechnology, 2012, 30(6):549-54). In this project, the first draft of the whole genome was developed within 4.5 months, and a high-density genetic linkage map containing more than 700 molecular markers was constructed within 2 months. Moreover, several genes related to male sterility, herbicide resistance and plant height were located; these can be used for breeding by transferring the targeted genes into strains, enabling rapid and effective selection. Applying this method, improved parental materials were obtained within 1 year, which shortened the breeding period and increased breeding efficiency significantly. Twenty-seven patents have been obtained in this project. The success of the foxtail millet project indicates that molecular breeding by whole-genome sequencing is feasible.

Progress in Super-rice

BGI and Hunan Hybrid Rice Research Center jointly initiated a super-hybrid-rice project in early 2012. Using whole-genome molecular breeding, this project will locate the main agronomic traits and make targeted improvements to current rice in the short term, in order to build a new super rice cultivar.

To date, we have finished the gene location and population analysis of strains with high pest resistance or giant fringe. We have finely located the genes of some important agricultural traits, such as natural pest resistance, filled-grain percentage, grain length, grain number per spike, number of first peduncle, number of second peduncle, setting percentage, plant height, period of sexual maturity, seed holding, grain length-width ratio and so on. Moreover, we have identified more than 600,000 molecular markers across the whole genome of super rice. Based on these gene loci and molecular markers, we have built a genome-based breeding platform. We will finish the improvement of rice within 2 years and, at the same time, add more traits in order to cultivate new super-rice in the short term.

Progress in Potato

On July 11, 2011, the potato genome was drafted by the International Potato Genome Sequencing Consortium, which was formed by 29 institutions including BGI, the Institute of Vegetables and Flowers of the Chinese Academy of Agricultural Sciences and others. The result was published online in Nature. This is another breakthrough in the genomics of this important tuber crop since the genome framework was built in 2009. The genome will be a valuable resource for the genetic improvement and breeding of potato.

In this research, BGI was in charge of the de novo sequencing, assembly and other bioinformatics analysis. The genome was sequenced to a coverage of 123X, and the assembled size was 727 Mb, which covered 86% of the whole genome. Potato is the first asterid species to be sequenced. The group identified 39,031 protein-coding genes, of which 2,642 are unique to asterid plants. The evolution of the genome, tuber growth and resistance to pests were also characterized. BGI played an important role in this research with its powerful sequencing platform and extensive experience in bioinformatics analysis.
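The assembly figures above imply an overall genome size, which can be checked with a line of arithmetic. The sketch below assumes that "covered 86% of the whole genome" means the assembled length divided by the total genome length; the function name is hypothetical.

```python
def implied_genome_size_mb(assembled_mb, fraction_covered):
    """Total genome size implied by an assembly that covers a given fraction of it."""
    if not 0 < fraction_covered <= 1:
        raise ValueError("fraction_covered must be in the interval (0, 1]")
    return assembled_mb / fraction_covered


# Figures quoted in the text.
ASSEMBLED_MB = 727.0      # assembled size in megabases
FRACTION_COVERED = 0.86   # share of the whole genome covered by the assembly

genome_size_mb = implied_genome_size_mb(ASSEMBLED_MB, FRACTION_COVERED)
print(f"Implied potato genome size: about {genome_size_mb:.0f} Mb")
```

Under that reading, the two figures correspond to a genome of roughly 845 Mb, leaving on the order of 120 Mb not represented in the draft assembly.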

Progress in Goat

The first goat genome map, completed by BGI and the Kunming Institute of Zoology, CAS, was published in Nature Biotechnology on December 24, 2012. This research combined next-generation sequencing with whole-genome mapping, overcame the difficulty of assembling the goat genome and delivered the first reference genome of a small ruminant. It will help to reveal the differences between ruminants and non-ruminants, and provides a new strategy for assembling large, complex genomes.

The goat was one of the first domesticated animals. Evidence indicates that the goat might have been domesticated from two wild Capra species, and it is now widely reared throughout the world, especially in China, India and other developing countries. Goats serve as an important source of meat, milk, fiber and pelts, and have also fulfilled agricultural, economic, cultural and even religious roles since very early times in human civilization. Despite the agricultural and biological importance of goats, breeding and genetics studies have been hindered by the lack of a reference genome sequence. The de novo sequencing of the goat will provide a valuable resource for marker-assisted breeding and for improving the economic traits of goats.

Progress in Oyster

The sequencing of the oyster genome was carried out by BGI and the Institute of Oceanology, CAS, and the result was published in Nature on September 20, 2012 (The oyster genome reveals stress adaptation and complexity of shell formation. Nature, 2012, 490 (7418): 49-54). Using next-generation sequencing and a novel assembly strategy, the researchers constructed the genome map of the oyster, and the draft assembly provided insight into a molluscan genome characterized by high polymorphism, abundant repetitive sequences and active transposable elements. Genomic, transcriptomic and proteomic analyses showed unique adaptations of oysters to sessile life in a highly stressful intertidal environment, as well as the complexity of shell formation.

As more and more oyster genome data are analyzed, it is hoped that the life habits of the oyster can be changed so that it serves people better. For example, in the natural state, oysters attach to ships and to the surfaces of pipelines, which hinders sailing and blocks the pipelines. Its sessile nature makes the oyster one of the marine fouling organisms. If the key receptor in the oyster's sessile pathway can be found, a new drug could be designed that switches the oyster to a free-swimming life, which would solve the problems of hindered shipping and blocked pipelines. With the results of transcriptome, miRNA, epigenetics and systems biology research, researchers have opened a new window through which to observe the oyster's biological traits and its interaction with the environment, and to understand its high fertility, high stress resistance and high heterozygosity. This work can help uncover the omics mechanisms of heterosis, species differentiation, sex determination, high mutation rate, high genetic load and sessile metamorphosis. It can then support an oyster genome-based breeding platform and theoretical research platform, advance the study of mollusc and marine genomics, and promote the healthy and sustainable development of the mollusc culture industry.