Abstract

Proteins are among the most important biomolecules, contributing to virtually every biochemical activity in the human body and in other biological species. Knowledge of a protein's 3D structure is fundamental to understanding its function, and scientists worked for decades to determine these structures. The difficulty of determining accurate 3D structures with conventional experimental methods (X-ray crystallography, NMR and cryo-electron microscopy) encouraged some scientists to pursue computational methods instead. AlphaFold 2, an AI-based program developed by Google’s DeepMind to predict protein structures from amino acid sequence, made a breakthrough in late 2020 when it “won” the 14th edition of CASP (Critical Assessment of Structure Prediction). DeepMind and the European Molecular Biology Laboratory’s European Bioinformatics Institute (EMBL-EBI) partnered to create the AlphaFold Protein Structure Database, which makes these predictions freely available to the scientific community. The latest release of the database (2022) contains over 200 million entries, providing broad coverage of UniProt (the standard repository of protein sequences and annotations). How a protein folds, functions and interacts with other molecules has long remained one of the most critical problems in bioinformatics. These results open up the potential for biologists to use computational structure prediction as a core tool in scientific research. AI methods may prove especially helpful for important classes of proteins, such as membrane proteins, that are very difficult to crystallize, a step that typical experimental structure determination requires.

In 2022, researchers (Wang et al., Science 2022) described two deep learning approaches for scaffolding protein functional sites without needing to prespecify the fold or secondary structure of the scaffold. The first approach, “constrained hallucination,” optimizes sequences such that their predicted structures contain the desired functional site. The second approach, “inpainting,” starts from the functional site and fills in additional sequence and structure to create a viable protein scaffold in a single forward pass through a specifically trained RoseTTAFold network. RoseTTAFold is a software tool that uses deep learning to quickly and accurately predict protein structures based on limited information.

At present, Artificial Intelligence has become a horizontal infrastructure, supporting the development of a variety of natural science disciplines and technological applications. The use of neural networks and AI software in natural science settings has promoted cross-fertilization between disciplines, including mathematical modeling, contemporary physics, specialised chemistry fields, materials science, engineering, biology and biomedical engineering.

http://chem-tox-ecotox.org/wp-content/uploads/2022/09/PROTEIN-3D-ARTIFICIAL-INTELLIGENCE-2022.pdf