Improving the Annotations of JCVI-Syn3a Proteins.

Authors: Kilinc M; Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA, USA., Jia K; Molecular, Cellular and Developmental Biology, Yale University, New Haven, CT, USA., Jernigan RL; Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA, USA. jernigan@iastate.edu.; Roy J. Carver Department of Biochemistry, Biophysics, and Molecular Biology, Iowa State University, Ames, IA, USA. jernigan@iastate.edu.
Language: English
Source: Methods in molecular biology (Clifton, N.J.) [Methods Mol Biol] 2025; Vol. 2867, pp. 153-168.
DOI: 10.1007/978-1-0716-4196-5_9
Abstract: JCVI-Syn3 is a minimal, self-replicating organism derived from Mycoplasma mycoides capri. While the ancestor has 863 genes, the synthetic progeny has only 473, of which 434 code for proteins. Despite initial efforts to understand all functions of the organism, a significant number of these protein-coding genes still have unknown functions, and subsequent studies have been only partially successful in elucidating their roles. In this study, we employ our method PROST to identify homologs and better understand these previously uncharacterized genes. PROST uses protein language model embeddings and can identify remote homologs with sequence identity as low as 16%. PROST finds functionally annotated homologs for 93% of the minimal genome with a high level of accuracy, both confirming previously identified functions and proposing new functions for others. The results of our study can be accessed at https://bit.ly/prost-syn3a .
(© 2025. The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature.)
Database: MEDLINE