Description: |
A single amino acid variation (SAV) is a substitution of one amino acid in a protein sequence. It may affect the overall protein structure, binding affinity, or functional domains, and may be related to disease, including cancer. However, clarifying the relationship between SAVs and cancer through traditional experiments is time- and resource-consuming. Although some computational methods for SAV prediction exist, most of them predict the change in protein stability caused by an SAV. In this work, all SAV characteristics derived from protein sequences, structures, and the micro-environment were converted into feature vectors and fed into an integrated prediction system built with a Support Vector Machine and a genetic algorithm. The critical features were then used to assess the relationship between their properties and cancer caused by SAVs. As a result, we developed a prediction system based on protein sequence and structure that can distinguish whether an SAV is related to cancer, achieving an accuracy, Matthews correlation coefficient, and F1-score of 90.88%, 0.77, and 0.83, respectively. Moreover, an online prediction server called CanSavPre was built (http://bioinfo.cmu.edu.tw/CanSavPre/), which will be a useful, practical tool for cancer research and precision medicine. |
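For readers unfamiliar with the three evaluation measures reported above, the following is a minimal sketch of how accuracy, the Matthews correlation coefficient (MCC), and the F1-score are conventionally computed from the confusion matrix of a binary classifier (here, cancer-related versus not cancer-related). The function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration; they are not taken from the study and do not reproduce its results.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, mcc, f1) from binary confusion-matrix counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives, and false negatives. Each metric falls back to 0.0
    when its denominator is zero.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0

    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0

    # F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall
    f1_denom = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denom if f1_denom else 0.0

    return accuracy, mcc, f1


# Hypothetical counts, for illustration only (not data from the study).
acc, mcc, f1 = binary_metrics(tp=40, fp=10, tn=45, fn=5)
print(f"accuracy={acc:.4f}  MCC={mcc:.4f}  F1={f1:.4f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it tends to be the more informative measure when the two classes differ in size.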