Author: |
Salvatore Cosentino, Sira Sriswasdi, Wataru Iwasaki |
Language: |
English |
Publication year: |
2024 |
Subject: |
|
Source: |
Genome Biology, Vol 25, Iss 1, Pp 1-18 (2024) |
Document type: |
article |
ISSN: |
1474-760X |
DOI: |
10.1186/s13059-024-03298-4 |
Description: |
Accurate inference of orthologous genes constitutes a prerequisite for comparative and evolutionary genomics. SonicParanoid is one of the fastest tools for orthology inference; however, its scalability and accuracy have been hampered by time-consuming all-versus-all alignments and the existence of proteins with complex domain architectures. Here, we present a substantial update of SonicParanoid, where a gradient boosting predictor halves the execution time and a language model doubles the recall. Application to empirical large-scale and standardized benchmark datasets shows that SonicParanoid2 is much faster than comparable methods and also the most accurate. SonicParanoid2 is available at https://gitlab.com/salvo981/sonicparanoid2 and https://zenodo.org/doi/10.5281/zenodo.11371108 . |
Database: |
Directory of Open Access Journals |
External link: |
|