Using Auto-Encoders to Create Encodings for Three-Dimensional Protein Structure Information.

Authors: Kohut, Angela M.; Kremer, Stefan C.; Graether, Steffen P.
Subject:
Source: Procedia Computer Science; 2024, Vol. 246, p1538-1547, 10p
Abstract: Proteins play a crucial role in various biological processes, serving as the building blocks and machines of life. Therefore, understanding their structure and function is paramount for advancing our knowledge of structural biology. Protein Data Bank (PDB) [3] files have been an integral part of helping researchers decipher the complex workings of proteins. PDB files provide three-dimensional Cartesian coordinates of protein structures, which are used as a stepping stone for other protein structure tools, such as protein classification. Efficient protein classification is vital for organizing and categorizing the large number of proteins discovered to date. It enables researchers to identify functional relationships, predict protein functions, and gain insights into their evolutionary history. However, current protein structural classification systems like CATH [22], SCOP [2] and SCOPe [5] have some limitations, such as complicated protein structure domain descriptions, manual and subjective classification, lack of customizability for users with different classification needs, and difficulty handling the increasing volume of protein structure data. Recently, image processing has advanced significantly, mainly due to neural networks such as Convolutional Neural Networks (CNNs) and auto-encoders. This work aims to harness the remarkable success of learned representations and CNNs for image processing by proposing a foundation model for the development of new encodings from three-dimensional protein structure information for various classification needs. The protein encodings will be helpful in other protein structure-related problems such as protein structure prediction, protein function prediction, and drug discovery. [ABSTRACT FROM AUTHOR]
Database: Supplemental Index