Pushmeet Kohli Discusses Future Challenges in DNA Semantics at Google DeepMind
Artificial intelligence (AI) has become an essential tool in scientific research. Pushmeet Kohli, Vice President of Science at Google DeepMind, emphasizes the complexity of human biology, likening it to the most intricate program ever written. His remarks follow a major achievement by his colleagues Demis Hassabis and John Jumper, who shared the 2024 Nobel Prize in Chemistry for their work on AI-based protein structure prediction. Their tool, AlphaFold2, has been used to predict the three-dimensional structures of over 200 million proteins, providing crucial information for understanding biological function.
Kohli leads a team of approximately 150 researchers at DeepMind. The team operates independently of Google's commercial objectives and focuses solely on scientific exploration. Originally from Dehradun, India, Kohli moved to the UK for his education and earned his doctorate from the University of Cambridge. He was a research director at Microsoft before joining DeepMind in 2017 to oversee its scientific projects.
In a recent discussion at the AI for Science forum in London, Kohli outlined the transformative impact of AI across various scientific disciplines. He noted that if a scientific inquiry can be framed as a reasoning or pattern recognition problem, AI can contribute significantly. However, he cautioned against the common misconception that AI can operate effectively without a proper understanding of the physical data being studied.
Regarding ongoing projects, Kohli highlighted a keen interest in genomics, particularly in understanding the semantics of DNA. He explained that their goal is to decipher the implications of genetic mutations and address the challenges posed by variants of unknown significance. Additionally, the team is exploring new materials, nuclear fusion, climate studies, and foundational research in mathematics and computer science.
In nuclear fusion, Kohli's team aims to improve plasma stability during reactor operation. The AI system finely controls the magnetic field to prevent disruptions that could destabilize the plasma. In materials research, the objective is to identify and develop substances that are both synthesizable and stable under laboratory conditions.
Kohli elaborated on the team's genomics research, saying that while they have made reliable predictions about the protein-coding regions of the human genome, the non-coding regions remain an open field of inquiry. The Human Genome Project sequenced the roughly three billion base pairs that make up human DNA, but the meaning of much of that sequence is yet to be fully understood.
He acknowledged the evolving landscape of AI, particularly the rise of generative AI models such as Google's Gemini. These models have enabled researchers to extract knowledge from scientific literature, facilitating new discoveries. Kohli expressed optimism about the potential of generative AI to enhance scientific research, as it allows for a broader interpretation of existing data.
Kohli also addressed the debate over synthetic data and its role in training AI models. He stated that while larger models can offer greater expressiveness, the diversity of the training data is crucial. Synthetic data can be useful, but it is not universally applicable. The team relies primarily on experimental data and simulations, and incorporates synthetic data cautiously, to ensure that the foundational models are robust.
In conclusion, Kohli affirmed that while the integration of data from various sources is vital, understanding when and how synthetic data can be effective remains a key area for future exploration. The overarching goal is to improve the performance of AI systems across diverse scientific challenges.