# Prediction of Protein Structures

Kathryn Tunyasuvunakool discovered an interest in computer programming while researching roundworms. She now applies that interest to predicting protein structures.

Kathryn Tunyasuvunakool grew up in a house where her mother, who started university a few years after Tunyasuvunakool was born, carried out scientific work at home. One day, her mother was timing the swings of a pendulum hanging from the ceiling for a science project. Another day, she was examining the patterns in fossil specimens spread across the dinner table for a report. This early exposure gave Tunyasuvunakool the sense that science is fun and that a career in research is a realistic goal. “I always wanted to go to university and become a scientist,” she says.

Tunyasuvunakool achieved that goal, studying mathematics as an undergraduate and computational biology as a graduate student. During her doctorate, she helped develop a model that depicts various aspects of the growth of the roundworm Caenorhabditis elegans, a popular subject of study for biologists and physicists alike. She also discovered a passion for programming, which led her naturally to a career in software engineering. Tunyasuvunakool is now a member of the team at DeepMind behind the protein-structure-prediction tool AlphaFold. She spoke to Physics Magazine about the program, which won two of its creators a Breakthrough Prize, and about why she is so enthusiastic about the discoveries it could lead to.

What is AlphaFold and what applications does it have?

AlphaFold is a machine learning model that predicts the structure of a protein from its amino acid sequence. The 1D amino-acid chain of a given protein can now be determined quickly with a variety of experimental methods, so protein sequences are relatively easy to obtain. But a protein's function depends on how it folds into a three-dimensional shape, and that shape is not obvious from the sequence alone. Folded structures can be determined experimentally, but the process takes time. AlphaFold predicts structures in a fraction of that time, which speeds up our understanding of complex systems.

What is your role on the AlphaFold team?

When I first joined the team, I worked as a software engineer and built data pipelines that took existing experimental protein structure data and turned it into features we could use to train the model. In doing so, I began to wonder how useful AlphaFold's predictions were. I started studying the predictions carefully and comparing them extensively with results in the literature. Eventually that became my full-time job: evaluating the model's performance and exploring applications of the program.

So how accurate are AlphaFold's predictions?

In 2020, I compared structures determined experimentally and reported in the highest-impact journals, particularly Nature, with those predicted by AlphaFold. At that time, AlphaFold performed quite well when we tried to predict single-chain protein structures. However, I noticed that most publications examined more complex systems with multiple chains rather than a single chain.

Inspired by this, we began to develop AlphaFold Multimer, a variant of the model designed for multi-chain protein complexes.

Have there been cases where AlphaFold's prediction did not match an experimentally derived structure and the experimental structure was shown to be wrong?

There have been a few cases, but I was not the one who found them. Researchers have used AlphaFold in numerous studies since it became publicly available. One finding from these studies is that AlphaFold sometimes predicts more accurate structures than those determined experimentally using nuclear magnetic resonance (NMR) methods. Constructing a structure from experimental NMR data requires a significant amount of processing. In many cases, the structure predicted by AlphaFold fit the original NMR data better than the structure derived from that data.

How many structures have been predicted by AlphaFold to date?

More than 200 million.

Have you ever worked on the structure of a noteworthy protein?

The first sequence I worked on with the AlphaFold version used in CASP14 (the 14th iteration of the biennial evaluation of protein-structure-prediction models) was for a protein of SARS-CoV-2, the virus that causes COVID-19. It was a depressing way to test the system, but it was clear that people were interested in the protein's structure.

What is the future of AlphaFold?

I can't go into too much detail, but I can say that the AlphaFold team is committed to solving protein-related problems over the long term. There are a number of things AlphaFold still cannot model, such as the effects of ligands or water molecules on the behavior of a particular protein, or non-protein components connected to the system of interest. The 3D structure of a protein is also just one of its many properties. It would be great to be able to predict other things, including how point mutations affect a protein's structure.

AlphaFold's success is very much a team effort; there are about 20 people working on improvements. The team often works with researchers to make sure we focus on topics that interest scientists. As a result, there are always new problems to explore.

Source: physics.aps.org/articles/v15/181

Updated: 30/11/2022 12:17