Predicting Class II MHC-Peptide binding: a kernel based approach using similarity scores

Author:	Darren R. Flower, Jesper Salomon
Language:	English
Publication year:	2006
Subject:	Protein Conformation; Sequence analysis; Sequence alignment; Peptide binding; Computational biology; Biology; lcsh:Computer applications to medicine. Medical informatics; Major histocompatibility complex; Biochemistry; Sequence Analysis Protein; Structural Biology; Databases Genetic; Humans; lcsh:QH301-705.5; Molecular Biology; Bioinformatics (life sciences); HLA-DR Antigen; Genetics; MHC class II; Binding Sites; HLA-A Antigens; Sequence Homology Amino Acid; Applied Mathematics; Histocompatibility Antigens Class II; Computational Biology; Reproducibility of Results; HLA-DR Antigens; Computer Science Applications; Kernel method; ROC Curve; lcsh:Biology (General); Kernel (statistics); biology.protein; lcsh:R858-859.7; Peptides; Sequence Alignment; Epitope Mapping; Research Article; HLA-DRB1 Chains; Protein Binding
Source:	BMC Bioinformatics, Vol 7, Iss 1, p 501 (2006)
ISSN:	1471-2105
Description:	Background: Modelling the interaction between potentially antigenic peptides and Major Histocompatibility Complex (MHC) molecules is a key step in identifying potential T-cell epitopes. For Class II MHC alleles, the binding groove is open at both ends, which makes the positional alignment between groove and peptide ambiguous and leaves it uncertain which parts of the peptide interact with the MHC. Moreover, the antigenic peptides have variable lengths, making naive modelling methods difficult to apply. This paper introduces a kernel method that handles variable-length peptides effectively by quantifying similarities between peptide sequences and integrating these into the kernel. Results: The kernel approach presented here shows increased prediction accuracy, with a significantly higher number of true positives and negatives on multiple MHC class II alleles, when tested on data sets from MHCPEP [1], MHCBN [2], and MHCBench [3]. Evaluation by cross-validation, when segregating binders and non-binders, produced an average of 0.824 AROC for the MHCBench data sets (up from 0.756), and an average of 0.96 AROC for multiple alleles of the MHCPEP database. Conclusion: The method improves performance over existing state-of-the-art methods of MHC class II peptide binding prediction by using a custom, knowledge-based representation of peptides. Similarity scores, in contrast to a fixed-length, pocket-specific representation of amino acids, provide a flexible and powerful way of modelling MHC binding, and can easily be applied to other dynamic sequence problems.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::df6b44ef89eb21434de812c695dd5fc9 https://ora.ox.ac.uk/objects/uuid:07743f9d-29f4-49b3-8f2b-2932ca636aa7
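
The description above refers to a kernel built from similarity scores between variable-length peptides and to evaluation by AROC. The following is a minimal sketch of that general idea, not the authors' kernel: the identity-based sliding-window score, the function names (`similarity`, `feature_map`, `kernel`, `aroc`) and the toy peptide strings are all assumptions made for illustration. Each peptide is represented by its similarity to a set of reference peptides, and the kernel is the dot product of those representations, which keeps it positive semidefinite regardless of peptide length.

```python
def similarity(a: str, b: str) -> float:
    """Best ungapped match between two non-empty peptides of any length.

    Slides the shorter peptide along the longer one and returns the highest
    fraction of identical residues. Identity matching is an assumption; a
    substitution-matrix score could be used instead.
    """
    short, long_ = sorted((a, b), key=len)
    best = 0
    for offset in range(len(long_) - len(short) + 1):
        matches = sum(s == l for s, l in zip(short, long_[offset:]))
        best = max(best, matches)
    return best / len(short)


def feature_map(peptide: str, references: list[str]) -> list[float]:
    """Represent a peptide by its similarity to each reference peptide."""
    return [similarity(peptide, ref) for ref in references]


def kernel(x: str, y: str, references: list[str]) -> float:
    """Dot product of similarity profiles (an explicit feature-space kernel)."""
    fx, fy = feature_map(x, references), feature_map(y, references)
    return sum(u * v for u, v in zip(fx, fy))


def aroc(scores: list[float], labels: list[int]) -> float:
    """Area under the ROC curve via pairwise ranking (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up toy peptides of different lengths (not real binding data).
references = ["AKLVAAGG", "GAKLVA", "TTSPQRWWT", "SPQRW"]
gram = [[kernel(x, y, references) for y in references] for x in references]
```

A Gram matrix such as `gram` could then be handed to any learner that accepts a precomputed kernel, and `aroc` applied to its cross-validated decision scores; which learner and similarity measure the paper actually used is not stated in this record.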