A protein is made from a ribbon of amino acids that folds itself up with many complex twists, turns, and tangles. This structure determines what the protein does. And figuring out what proteins do is key to understanding the basic mechanisms of life, both when they work and when they fail. Efforts to develop vaccines for covid-19 have focused on the virus's spike protein, for example. The way the coronavirus snags onto human cells depends on the shape of this protein and the shapes of the proteins on the outsides of those cells. The spike is just one protein among billions across all living things; there are tens of thousands of different types of protein inside the human body alone.
In this year's CASP, AlphaFold predicted the structure of dozens of proteins with a margin of error of just 1.6 angstroms (0.16 nanometers, or about the width of an atom). This far outstrips all other computational methods and for the first time matches the accuracy of techniques used in the lab, such as cryo-electron microscopy, nuclear magnetic resonance, and x-ray crystallography. These techniques are expensive and slow: it can cost hundreds of thousands of dollars and take years of trial and error for each protein. AlphaFold can find a protein's shape in a few days.
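To give a sense of what a 1.6-angstrom margin of error means in practice, here is a minimal sketch. It assumes the error is measured as a root-mean-square deviation (RMSD) between predicted and lab-determined atom positions that have already been aligned; the article does not say exactly which metric was used. The function names and toy coordinates are hypothetical, not taken from AlphaFold or CASP.

```python
import math

ANGSTROMS_PER_NANOMETER = 10.0


def rmsd(predicted, experimental):
    """Root-mean-square deviation between two equal-length lists of
    (x, y, z) atom positions, in whatever unit the inputs use.

    Assumption: the two structures are already superimposed; no
    alignment step is performed here.
    """
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("structures must be non-empty and the same length")
    total = 0.0
    for (px, py, pz), (ex, ey, ez) in zip(predicted, experimental):
        total += (px - ex) ** 2 + (py - ey) ** 2 + (pz - ez) ** 2
    return math.sqrt(total / len(predicted))


def angstroms_to_nanometers(value):
    """Convert a length in angstroms to nanometers."""
    return value / ANGSTROMS_PER_NANOMETER


# Toy example: every predicted atom sits 1.6 angstroms from its true position.
predicted = [(0.0, 0.0, 1.6), (1.0, 0.0, 1.6)]
experimental = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
error = rmsd(predicted, experimental)
print(round(error, 2), "angstroms =", round(angstroms_to_nanometers(error), 2), "nanometers")
```

Real assessments superimpose the two structures before comparing them and often use other scores as well, so this illustrates only the scale of the error.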
The breakthrough could help researchers design new drugs and understand diseases. In the longer term, predicting protein structure will also help design synthetic proteins, such as enzymes that digest waste or produce biofuels. Researchers are also exploring ways to introduce synthetic proteins that could increase crop yields and make plants more nutritious.
"It's a very substantial advance," says Mohammed AlQuraishi, a systems biologist at Columbia University who has developed his own software for predicting protein structure. "It's something I simply didn't expect to happen nearly this rapidly. It's shocking, in a way."
"This really is a big deal," says David Baker, head of the Institute for Protein Design at the University of Washington and leader of the team behind Rosetta, a family of protein analysis tools. "It's an amazing achievement, like what they did with Go."
Identifying a protein's structure is very hard. For most proteins, researchers have the sequence of amino acids in the ribbon but not the contorted shape it folds into. And there is typically an astronomical number of possible shapes for each sequence. Researchers have been wrestling with the problem at least since the 1970s, when Christian Anfinsen won the Nobel Prize for showing that a protein's sequence determines its structure.
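A back-of-the-envelope calculation shows how quickly the number of possible shapes grows. The sketch below assumes, purely for illustration, that each amino acid in the chain can independently take one of three local conformations. That figure is a simplifying assumption in the spirit of classic textbook estimates, not a measured value, and the function name is hypothetical.

```python
def count_conformations(num_residues: int, states_per_residue: int = 3) -> int:
    """Number of distinct shapes if every residue independently picks
    one of `states_per_residue` local conformations (an assumed figure)."""
    if num_residues < 0 or states_per_residue < 1:
        raise ValueError("need a non-negative length and at least one state")
    return states_per_residue ** num_residues


# A modest 100-residue protein under the three-states-per-residue assumption.
shapes = count_conformations(100)
print(f"{shapes:.2e} possible shapes")
```

Even under this crude assumption, a 100-residue chain has on the order of 10^47 candidate shapes, far too many to check one by one.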
The launch of CASP in 1994 gave the field a boost. Every two years, the organizers release 100 or so amino acid sequences for proteins whose shapes have been identified in the lab but not yet made public. Dozens of teams from around the world then compete to find the correct way to fold them up using software. Many of the tools developed for CASP are already used by medical researchers. But progress was slow, with two decades of incremental advances failing to produce a shortcut to painstaking lab work.
CASP got the jolt it was looking for when DeepMind entered the competition in 2018 with its first version of AlphaFold. It still could not match the accuracy of a lab, but it left other computational techniques in the dust. Researchers took note: soon many were adapting their own systems to work more like AlphaFold.