David Baker: How to win a Nobel Prize

Using Google's DeepMind AI to design countless useful proteins...

10 December 2024

Interview with David Baker

Part of the show Titans of Science: David Baker

[Image: David Baker, Nobel Prize. Credit: Ian C Haydon]


In this edition of Titans of Science, Chris Smith chats to David Baker, the Nobel Prize winner who used AI to design custom proteins...

Chris - This was about the turn of the millennium, early 2000s, wasn't it, that you were doing this? So that got you a bit further along. But there was still clearly a gap because this didn't immediately revolutionise our ability to predict proteins, or you would've won the Nobel Prize a lot longer ago than you have. So what was still the stumbling block at that stage then?

David - Well, the stumbling blocks for both structure prediction and design were just the ones that I described: proteins are very complicated and they're made out of many thousands of atoms. So it was really hard to do accurate calculations and get really accurate structure predictions, for example. On the design side, we were able to design more and more powerful proteins doing a wider and wider range of jobs, but we had to try a lot of different designs to find one that really worked well and solved the problem that we intended it to solve. So the real game changer was the advent of deep learning, and that was really demonstrated in a spectacular fashion by the DeepMind team, my co-laureates John Jumper and Demis Hassabis, who showed that the database of protein structures was sufficiently large that one could learn from it the rules of protein folding and go from an amino acid sequence directly to a three-dimensional structure. So I have to tell you one thing though, just to put this in context. Before it was possible to predict the structure of a protein from its amino acid sequence, scientists around the world spent many, many years, and actually still do, determining the structures of proteins experimentally. That means figuring out where in space each atom of a protein is. And they do this in a number of ways. For example, one of the most powerful is shining X-rays at a crystal of the protein and figuring out how those X-rays scatter. And that gives you direct information on the position of atoms. Now, tens of thousands of scientists, over 50 years and at an expense of tens of billions of dollars or more, spent their careers determining the structures of proteins. And many scientists, great scientists, are continuing to solve the structures of more and more complex proteins.
And so what this led to was a database of about 200,000 different protein structures, and each protein structure specifies exactly where each atom in that protein is relative to the others. So it's this incredibly rich storehouse of information. And what the DeepMind group showed is that this information store was sufficiently detailed and rich, that you could really learn the rules and predict structures of proteins from their sequence.

Chris - You feed into the artificial intelligence all of that wealth of information where people have painstakingly worked out where the atoms are in three-dimensional space in each of those proteins, so it can then learn. And that presumably means you can then feed it an unknown protein, an amino acid sequence, the building blocks of a protein it has never seen before. And it can apply the same rules to then work out what it would look like.

David - That's exactly right. So the program that the DeepMind group developed is called AlphaFold. AlphaFold was trained on all the amino acid sequences of proteins of known structure. It was trained to predict the structure. And so now you can give a new amino acid sequence to AlphaFold, and it will generate the predicted structure for it.

Chris - One of the things that the award committee said was that you achieved the almost impossible feat of making new proteins. So this was essentially upstream of what we've just said. You proved that you could make a new protein from scratch, you could come up with a concept and design it. And I suppose what the DeepMind team then did was to equip you with a way of doing that far faster.

David - Well, yes. So as I described, we started designing proteins long before deep learning was even a well-established field. We used the sort of atomic description that I described earlier, where we had to model all the interactions between pairs of atoms, and we used that approach to design completely new proteins. And that was what was cited by the Nobel Committee. That was back in 2003. After DeepMind showed that protein structure prediction could be greatly enhanced using deep learning, we naturally moved very quickly to apply deep learning to protein design. And what we found is that we were able to develop very powerful methods for designing brand new proteins that were much better than the previous sort of methods based on this cloud of atoms I described earlier. And using these new design methods, we can design proteins that have a very wide range of different functions, and we have made these methods freely available to anyone in the world. And so it's very exciting now because we're seeing many different research groups designing new proteins using the deep learning methods we've developed. 10, 15 years ago the idea of trying to solve a problem in biotechnology or in sustainability with a designed protein just sounded totally crazy, on the lunatic fringe. But now there's really great interest in designing new proteins to solve problems in medicine, in sustainability and technology. So it's a very exciting time.

Chris - To think about how we might deploy something like this, could you say, for example, 'Well look, ocean and marine plastic pollution, that's a major headache. I want to design an enzyme that has never existed in nature. It's a protein that can attack plastic in the ocean and get rid of it.' Could we throw that sort of problem at this sort of solution now and begin to build protein machines that would do that sort of job for us?

David - That is exactly the type of problem that we're working on now. So there are several extremely talented researchers in my group who are working specifically on that, to design catalysts that will break down plastic. We're also working on new ways to fix CO2, as well as new proteins that will very specifically target cancer cells in the body, so you can treat the cancer without systemic effects. It's an exciting time also because we have our first medicine that has been approved for use in humans. And that's a vaccine, a Covid vaccine developed by my colleague Neil King at the Institute for Protein Design here.