Using an Ensemble to Identify and Classify Macroalgae Antimicrobial Peptides
Author: | Joan O’Keeffe, John Healy, Orla Slattery, Michela Caprani |
---|---|
Year of publication: | 2021 |
Subject: | 0303 health sciences; Phylum; First line; 030302 biochemistry & molecular biology; Antimicrobial peptides; Health Informatics; Computational biology; Biology; General Biochemistry, Genetics and Molecular Biology; Computer Science Applications; 03 medical and health sciences; Broad spectrum; ComputingMethodologies_PATTERNRECOGNITION; Protein sequencing; Data sequences; Amino acid composition; Function (biology); 030304 developmental biology |
Source: | Interdisciplinary Sciences: Computational Life Sciences. 13:321-333 |
ISSN: | 1867-1462 1913-2751 |
Description: | The rapid spread of multi-drug resistant microbes has led researchers to seek natural alternative remedies such as antimicrobial peptides (AMPs). As a first line of defense, AMPs display a broad spectrum of potent activity against multi-resistant pathogenic bacteria, viruses, fungi, and even cancer. AMPs can be further characterised into families according to amino acid composition, secondary structure, and function. However, despite recent advancements in rapid computational methods for AMP prediction from various mammalian, aquatic, and terrestrial species, there is limited information regarding their presence, functional roles, and family type in marine macroalgae. In this paper, we present a promising two-tier ensemble of heterogeneous machine learning models that integrates seven well-known machine learning classifiers to predict AMPs from macroalgae. The first tier of the ensemble consists of a suite of binary classifiers that identify AMPs from protein sequence data; the sequences predicted to be AMPs are then forwarded to a second-tier multi-class ensemble that characterises their functional family type. The two-tier ensemble was successfully used to identify 39 putative AMP sequences in 12 macroalgae species from three different phyla. The approach we describe is not limited to AMPs and can also be applied to search sequence data for other types of proteins. |
Database: | OpenAIRE |
External link: |
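
The two-tier flow outlined in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation. The record does not name the seven classifiers, the features, or the family labels, so everything below is an assumption made for illustration: the rule-based voters stand in for trained models, amino acid composition is used as the only feature, majority voting is used to combine votes, and the labels `linear`, `cysteine-rich`, `glycine-rich`, and `other` are hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
Features = Dict[str, float]


def composition(seq: str) -> Features:
    """Fraction of each of the 20 standard amino acids in a sequence."""
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    return {aa: (counts[aa] / total if total else 0.0) for aa in AMINO_ACIDS}


# --- Tier 1: hypothetical binary voters (stand-ins for trained classifiers) ---
def net_cationic(f: Features) -> bool:
    return f["K"] + f["R"] > f["D"] + f["E"]


def hydrophobic(f: Features) -> bool:
    return sum(f[aa] for aa in "AILMFVW") >= 0.3


def cysteine_or_basic(f: Features) -> bool:
    return f["C"] >= 0.1 or f["K"] + f["R"] >= 0.2


# --- Tier 2: hypothetical multi-class voters with made-up family labels ---
def family_by_cysteine(f: Features) -> str:
    return "cysteine-rich" if f["C"] >= 0.1 else "linear"


def family_by_charge(f: Features) -> str:
    if f["C"] >= 0.15:
        return "cysteine-rich"
    return "linear" if f["K"] + f["R"] >= 0.2 else "other"


def family_by_glycine(f: Features) -> str:
    return "glycine-rich" if f["G"] >= 0.2 else "linear"


TIER1: List[Callable[[Features], bool]] = [net_cationic, hydrophobic, cysteine_or_basic]
TIER2: List[Callable[[Features], str]] = [family_by_cysteine, family_by_charge, family_by_glycine]


def is_amp(f: Features) -> bool:
    """Tier 1: strict majority vote of the binary voters."""
    votes = [model(f) for model in TIER1]
    return sum(votes) * 2 > len(votes)


def family(f: Features) -> str:
    """Tier 2: most frequent label among the multi-class voters."""
    votes = [model(f) for model in TIER2]
    return Counter(votes).most_common(1)[0][0]


def two_tier(seq: str) -> Optional[str]:
    """Return a family label if tier 1 accepts the sequence, else None."""
    f = composition(seq)
    return family(f) if is_amp(f) else None
```

In use, a sequence rejected by the first tier never reaches the second, which mirrors the forwarding step in the description: `two_tier("DDEEDDEEGG")` returns `None`, while a lysine-rich toy sequence such as `"KKLLKKLLKKLL"` passes tier 1 and receives a tier 2 label. In a real pipeline each voter would be a trained classifier rather than a fixed threshold rule.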