Chan Zuckerberg Science to Build AI GPU Cluster to Model Cell Systems 

REDWOOD CITY, Calif., Sept. 19, 2023 -- Today, the Chan Zuckerberg Initiative announced the funding and building of one of the largest computing systems dedicated to nonprofit life science research in the world.

This new effort will provide the scientific community with access to predictive models of healthy and diseased cells, which will lead to groundbreaking new discoveries that could help cure, prevent, or manage all diseases by the end of this century. The high-performance computing cluster, which is planned to comprise 1,000+ GPUs, will enable AI and large language models for biomedicine at scale.

"AI is creating new opportunities in biomedicine, and building a high-performance computing cluster dedicated to life science research will accelerate progress on important scientific questions about how our cells work," said CZI Co-founder and Co-CEO Mark Zuckerberg. "Developing digital models capable of predicting all cell types and cell states from the genome will help researchers better understand our cells and how they behave in health and disease."

The increasing complexity, size, and accessibility of scientific datasets, as well as the rapid rise of scalable AI and machine learning methods, create a unique opportunity to apply advances in large language models (LLMs) to biomedicine. AI systems such as AlphaFold and ESM have already made significant contributions to studying human biology. By spurring adoption across the life sciences, high-performance computing (HPC) will provide the necessary support for the ever-increasing size of LLMs through significant investments in GPUs, either on premises or in the cloud. Currently, scaled and robust infrastructure is cost-prohibitive for many organizations, especially academic research institutions. The CZI-funded GPU cluster will be one of the first to power openly available models of human cells, allowing researchers to collaboratively accelerate their work.

"Bringing the power of generative AI to biology at scale will allow researchers to incorporate these technological advances into their work, which will accelerate efforts to cure, prevent, or manage all disease," said CZI Co-founder and Co-CEO Priscilla Chan. "AI models could predict how an immune cell responds to an infection, what happens at the cellular level when a child is born with a rare disease, or even how a patient's body will respond to a new medication. We hope that this collaborative effort will generate new insights about the fundamental characteristics of our cells."

These predictive models will be trained on datasets such as those integrated into the Chan Zuckerberg CELL by GENE (CZ CELLxGENE) software tool, which comprises the largest corpus of standardized single-cell datasets, with more than 50 million cells. Other data sources include resources generated by CZ Science research institutes, such as the protein location and interaction atlas OpenCell and the cell atlas Tabula Sapiens, built by the Chan Zuckerberg Biohub San Francisco. Large imaging datasets from the Chan Zuckerberg Institute for Advanced Biological Imaging (CZ Imaging Institute) will also be included, as well as publicly available datasets.

"Developing a virtual biology simulator is a natural evolution of our work in science over the past seven years," said CZI Head of Science Stephen Quake. "We have supported researchers to generate and annotate standardized, representative datasets; built tools to integrate these datasets and make them widely available — and, through our scientific institutes, we've built a new model for the kind of collaboration required to undertake this ambitious vision of building predictive cell models. CZ Science has employed many AI tools in its research for years, and this focus will unify our collective efforts to create a field-wide resource for better understanding cells and cell systems."

Current applications of AI developed by CZI's science technology team include CellGuide, a free, interactive encyclopedia — with definitions generated by ChatGPT — that quickly gives researchers key information about over 700 cell types and sub-cell types, including definitions, canonical and computational marker genes, an expandable ontology tree visualization of a cell's lineage, and relevant datasets. The CZ Imaging Institute, in partnership with CZI's science technology team, is prototyping a cloud-based, open-source CryoET Data Portal aimed at driving the development of automated annotations of cryo-ET datasets.

"CZI's science technology team brings a wealth of knowledge and experience in partnering with researchers to understand their challenges and build technology that makes new science possible," said CZI Vice President of Science Technology Patricia Brennan. "Projects like CELLxGENE have already proven to be widely useful for the field in accelerating single-cell research, with about 75% of the data originating from researchers beyond CZI who are helping us grow this data corpus. With these new AI-driven cellular models, we hope to build shared, collaborative resources that drive future breakthroughs."

CZ Science institutes bring together interdisciplinary researchers to pursue ambitious scientific challenges that couldn't be accomplished in conventional environments. In 2022, science, technology, and AI leaders launched the Kempner Institute for the Study of Natural & Artificial Intelligence at Harvard University, where researchers are studying the basis of intelligence in natural and artificial systems. CZI is supporting the CZ Biohub Network to purchase the equipment, and the CZ Biohub San Francisco has a team dedicated to HPC that supports its research and will bring its expertise to standing up this new computing system.

Read more in MIT Technology Review from CZI Co-founders and Co-CEOs Priscilla and Mark about creating AI tools to accelerate biological research.


Source: Chan Zuckerberg Initiative

AIwire