Machine learning application identifies novel gene signatures from transcriptomic data of spontaneous canine hemangiosarcoma
Author: | Ashley J. Schulte, Fadil Santosa, Nuojin Cheng, Jong Hyuk Kim |
---|---|
Year of publication: | 2020 |
Subject: | Hemangiosarcoma; Feature selection; Biology; Machine learning; Malignancy; 03 medical and health sciences; Dogs; 0302 clinical medicine; Animals; Angiosarcoma; Dog Diseases; Molecular Biology; 030304 developmental biology; 0303 health sciences; Gene Expression Profiling; Cancer; Gold standard (test); Canine Hemangiosarcoma; Random forest; Gene Expression Regulation, Neoplastic; 030220 oncology & carcinogenesis; Artificial intelligence; Databases, Nucleic Acid; Transcriptome; Information Systems |
Source: | Briefings in Bioinformatics. 22 |
ISSN: | 1477-4054 1467-5463 |
DOI: | 10.1093/bib/bbaa252 |
Description: | Angiosarcomas are soft-tissue sarcomas that form malignant vascular tissues. Angiosarcomas are very rare, and due to their aggressive behavior and high metastatic propensity, they have poor clinical outcomes. Hemangiosarcomas commonly occur in domestic dogs, and share pathological and clinical features with human angiosarcomas. Typical pathognomonic features of this tumor are irregular vascular channels that are filled with blood and are lined by a mixture of malignant and nonmalignant endothelial cells. The current gold standard is the histological diagnosis of angiosarcoma; however, microscopic evaluation may be complicated, particularly when tumor cells are undetectable due to the presence of excessive amounts of nontumor cells or when tissue specimens have insufficient tumor content. In this study, we implemented machine learning applications from next-generation transcriptomic data of canine hemangiosarcoma tumor samples (n = 76) and nonmalignant tissues (n = 10) to evaluate their training performance for diagnostic utility. The 10-fold cross-validation test and multiple feature selection methods were applied. We found that extra trees and random forest learning models were the best classifiers for hemangiosarcoma in our testing datasets. We also identified novel gene signatures using the mutual information and Monte Carlo feature selection method. The extra trees model revealed high classification accuracy for hemangiosarcoma in validation sets. We demonstrate that high-throughput sequencing data of canine hemangiosarcoma are trainable for machine learning applications. Furthermore, our approach enables us to identify novel gene signatures as reliable determinants of hemangiosarcoma, providing significant insights into the development of potential applications for this vascular malignancy. |
Database: | OpenAIRE |
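The workflow summarized in the description (mutual-information feature selection, extra trees and random forest classifiers, 10-fold cross-validation) can be illustrated with a minimal scikit-learn sketch. This is not the authors' pipeline: the expression matrix is synthetic, and the gene count, the number of selected features (`k=20`), the class weighting, and the scoring metric are all assumptions chosen for illustration. Only the class sizes (76 tumor, 10 nonmalignant) come from the record. The Monte Carlo feature selection method mentioned in the description is not covered here.

```python
from functools import partial

import numpy as np
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Synthetic stand-in for transcriptomic data (assumption, not real data):
# 76 tumor samples and 10 nonmalignant samples, 200 hypothetical genes.
n_tumor, n_normal, n_genes = 76, 10, 200
X = rng.normal(size=(n_tumor + n_normal, n_genes))
y = np.array([1] * n_tumor + [0] * n_normal)
X[y == 1, :5] += 2.0  # hypothetical signal in the first five genes

# Stratified 10-fold CV keeps one nonmalignant sample in each test fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# Mutual information scores each gene against the class label.
mi_score = partial(mutual_info_classif, random_state=0)

models = {
    "extra_trees": ExtraTreesClassifier(
        n_estimators=100, class_weight="balanced", random_state=0
    ),
    "random_forest": RandomForestClassifier(
        n_estimators=100, class_weight="balanced", random_state=0
    ),
}

results = {}
for name, model in models.items():
    # Feature selection sits inside the pipeline so that it is refit on
    # each training fold and never sees the held-out samples.
    pipe = make_pipeline(SelectKBest(mi_score, k=20), model)
    results[name] = cross_val_score(
        pipe, X, y, cv=cv, scoring="balanced_accuracy"
    )

for name, scores in results.items():
    print(f"{name}: mean balanced accuracy = {scores.mean():.3f}")
```

Placing the selector inside the pipeline matters with so few nonmalignant samples: selecting genes on the full dataset before cross-validation would leak label information into the test folds and inflate the reported accuracy.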