


Exploring the Convergence of AI and Life Sciences: A Conversation with Dr. Zhang Yong of BGI-Research

April 02, 2024

In an era dominated by digits and codes, Dr. Zhang Yong, the Principal Scientist of Spatiotemporal Omics at BGI-Research, stands as a decipherer of life's complex codes, bridging the gap between information technology (IT) and biotechnology (BT) to unravel the mysteries of life through vast oceans of data. Utilizing algorithms developed by Zhang and his team, researchers can pinpoint genes linked to diseases, hereditary traits, and evolution amidst the sea of life science data.

In June 2023, Dr. Zhang's team launched the STOmics Cloud platform, followed by the release of six major algorithmic tools for spatiotemporal omics in February 2024.

Recently, we had the privilege of speaking with Dr. Zhang about the foundational tools that support advances in life science and about how his team approaches these challenges, leveraging innovative technology to propel the field forward.

Q: Can you share your research focus? What motivated you to join BGI?

Zhang: My 14-year career at BGI has revolved around bioinformatics and biological big data. I joined BGI in 2010 with a background in computer science, focusing on information security. I then pursued my Ph.D. in bioinformatics through BGI College’s joint training program.

My decision to join BGI was influenced by personal interest and by the advancement of life science technology. I've always been fascinated by biology, and I was intrigued by BGI when the institute visited my school for recruitment. Back then, I had only read about "gene sequencing" in textbooks, and it seemed like an exciting new field. Moreover, 2010 was a time when high-throughput sequencing technology was advancing rapidly.

Q: The term "bioinformatics" might seem mysterious to the public. Could you give a simple introduction, and what exactly do you focus on?

Zhang: Deciphering life involves "reading," "writing," and "storing." Our research is focused on "reading," which itself is divided into two parts: converting biological samples into data, then data into knowledge or applications.

I am primarily responsible for turning data into applications, extracting valuable information from the DNA nucleotide codes for analysis.

Most researchers lack a background in algorithms, so it falls to those who understand algorithms to develop them. We are the ones responsible for developing these algorithmic tools.

Researchers then use these tools to analyze data, interpret results with their biological knowledge, and uncover mysteries.

Q: What has been the most challenging project in your career?

Zhang: For me, the STOmics Cloud platform has been the most challenging project so far. It demands systematic team management, effective communication, and handling of technical challenges.

How do we create an excellent bioinformatics cloud platform? How can non-professionals, like doctors, use our platform? This requires an understanding of the needs of different user groups in different scenarios, combined with an understanding of bioinformatics analysis, which poses great challenges both technically and in terms of product development.

Dr. Zhang Yong, the Principal Scientist of Spatiotemporal Omics at BGI-Research, in discussion with a colleague.

Q: Could you introduce the STOmics Cloud platform and how it helps scientists analyze and understand spatiotemporal genomics data?

Zhang: The STOmics Cloud platform is an analytical platform featuring three main functions.

First is project and data management. In the past, analysts had to manage projects in a command-line interface on a black screen, issuing commands through code, which was impossible for those who could not code. Now, analysts can simply click through a webpage to create their own projects and manage their data within those projects. This has solved a fundamental problem.

Second, we built multiple analysis modules on the platform, which we call the 3+1 modules. The three are workflow analysis for standardized batch analysis, interactive tools for visualization and interactive exploration, and personalized analysis for various research discoveries. The "+1" is an analysis “App Store,” where users can pick up the analysis modules they want, including many BGI-made bioinformatics tools, workflows, and the datasets they need.

Lastly, we've introduced an intelligent assistance system that helps with knowledge Q&A, biological interpretation, literature review, and document drafting, which further lowers the barriers to data analysis.

Q: Today, big biological data presents unprecedented opportunities and immense challenges. What contributions does the STOmics Cloud platform make in this regard?

Zhang: The reason we named it the STOmics Cloud platform is that it was initially designed to serve spatiotemporal omics (STOmics) projects, which face issues of large data volumes, high data dimensions, and complex data structures. Traditional processing methods are inadequate for such tasks.

For instance, BGI's largest spatiotemporal chip measures 13cm by 13cm with 16.9 billion capture points, translating to 10TB or more of data per sample, a stark contrast to traditional genomic datasets of approximately 100GB.

The platform aims to overcome the computational and analytical challenges presented by such vast data.
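To put the figures quoted above in perspective, the short Python sketch below works through the arithmetic. It uses only the numbers given in the interview and treats them as exact; reading "10TB" and "100GB" as decimal units is an assumption, and the function and constant names are purely illustrative.

```python
# Figures quoted in the interview, treated as exact for illustration.
CHIP_SIDE_CM = 13                # chip measures 13 cm by 13 cm
CAPTURE_POINTS = 16.9e9          # 16.9 billion capture points
STOMICS_SAMPLE_BYTES = 10e12     # "10TB or more" (assumed decimal terabytes)
GENOMIC_SAMPLE_BYTES = 100e9     # "approximately 100GB" (assumed decimal gigabytes)


def chip_summary():
    """Derive rough per-area and per-point figures from the quoted numbers."""
    area_cm2 = CHIP_SIDE_CM ** 2
    return {
        "area_cm2": area_cm2,
        # Average number of capture points per square centimetre of chip.
        "points_per_cm2": CAPTURE_POINTS / area_cm2,
        # Average data volume attributable to one capture point.
        "bytes_per_point": STOMICS_SAMPLE_BYTES / CAPTURE_POINTS,
        # How many times larger one such sample is than a ~100GB dataset.
        "size_ratio": STOMICS_SAMPLE_BYTES / GENOMIC_SAMPLE_BYTES,
    }


if __name__ == "__main__":
    for name, value in chip_summary().items():
        print(f"{name}: {value:,.1f}")
```

On these numbers, the chip covers 169 square centimetres, averages about 100 million capture points per square centimetre, and produces a sample roughly 100 times larger than a traditional genomic dataset.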

Q: In February this year, BGI-Research published a series of spatiotemporal omics algorithm tools as a special issue in GigaScience and GigaByte. Could you briefly introduce this achievement?

Zhang: As I mentioned earlier, spatiotemporal omics involves large and complex data, with an additional spatial dimension compared to traditional single-cell data. Analyzing such data therefore requires us to develop new algorithms and tools.

This special issue includes these new tools, which can effectively process high-dimensional and complexly structured spatiotemporal omics data, thereby helping researchers to deeply understand the structure and function of biological systems.

Q: What role do you think technical tools play in the development of life sciences?

Zhang: Tools play a decisive role. First of all, life science is a natural science, so at its core it must observe first and only then move on to scientific understanding and breakthroughs. Second, only when the cost of obtaining data is low enough can the data be applied widely and universally. This is the irreplaceable role of technical tools in the development of life sciences.

Q: What are your expectations for the future?

Zhang: I believe there is vast room for development in the BIT (bioinformatics) field, with many opportunities to make industrial or scientific contributions. I therefore hope to continue working in this field, to keep solving problems related to biological big data, to create more and better algorithms, tools, and systems, and to help scientists overcome the challenges of big data.