Single Nucleotide Polymorphism and Type 2 Diabetes Mellitus Phenotypes Association Using Gradient Boosting

Authors: Rudi Heryanto, Lailan Sahrina Hasibuan, Wisnu Ananta Kusuma, Aulia Fadli
Publication year: 2020
Subject:
Source: 2020 International Conference on Advanced Computer Science and Information Systems (ICACSIS).
DOI: 10.1109/icacsis51025.2020.9263142
Description: Precision medicine is a medical field that aims to tailor disease treatment to an individual’s genetic information, environment, and lifestyle. Recently, precision medicine research has focused on complex diseases such as diabetes mellitus (DM). The most common form of DM is type 2 DM (T2DM). Genetic information about T2DM patients can be obtained by finding associations between single nucleotide polymorphisms (SNPs) and T2DM phenotypes. This research aims to find SNPs that are considered related to T2DM phenotypes using the gradient boosting (GB) algorithm. Data were taken from the Mouse Phenome Database website based on 98 candidate proteins for T2DM. Preprocessing consisted of deleting unused features and missing values, encoding SNPs, and merging the phenotype and SNP data. The model was built using GB with decision tree base learners and a least-squares loss function. GB produced an average MSE of 0.061 and an MAE of 0.171, and identified 30 SNPs potentially associated with the insulin tolerance phenotype of T2DM. Twenty-two of the 30 chosen SNPs were verified as associated with T2DM phenotypes on the Mouse Genome Informatics website, based on SNP-protein-phenotype relationships.
Database: OpenAIRE
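
Illustration: the abstract describes gradient boosting with decision tree base learners and a least-squares loss, applied to encoded SNPs and evaluated with MSE and MAE. The following is a minimal sketch of that general technique, not the authors' implementation. Everything specific in it is an assumption made for illustration: the genotypes and phenotype values are synthetic, the additive 0/1/2 SNP encoding is one common scheme that the abstract does not confirm, the base learners are restricted to depth-1 trees (stumps), the errors are computed on the training data rather than averaged over folds, and ranking SNPs by accumulated split gain is a hypothetical selection rule.

```python
# Toy sketch only: synthetic genotypes and phenotype values, not the paper's data.

def encode_genotype(genotype, ref_allele):
    """Additive encoding (an assumption): count of non-reference alleles, 0 to 2."""
    return sum(1 for allele in genotype if allele != ref_allele)


def fit_stump(X, residuals):
    """Best single split by squared error: (feature, threshold, left, right, gain)."""
    mean = sum(residuals) / len(residuals)
    total_sse = sum((r - mean) ** 2 for r in residuals)
    best = None
    for j in range(len(X[0])):
        # Candidate thresholds: every observed value except the largest,
        # so both sides of the split are always non-empty.
        for threshold in sorted({row[j] for row in X})[:-1]:
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, threshold, lm, rm)
    if best is None:  # every feature is constant, nothing to split on
        return None
    sse, j, threshold, lm, rm = best
    return j, threshold, lm, rm, total_sse - sse


def fit_gradient_boosting(X, y, n_rounds=50, learning_rate=0.1):
    """Least-squares boosting: each stump is fitted to the current residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # For squared-error loss the negative gradient is the residual y - f(x).
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        stumps.append(stump)
        j, t, lm, rm, _ = stump
        preds = [p + learning_rate * (lm if row[j] <= t else rm)
                 for p, row in zip(preds, X)]
    return base, learning_rate, stumps


def predict(model, row):
    base, lr, stumps = model
    return base + lr * sum(lm if row[j] <= t else rm for j, t, lm, rm, _ in stumps)


# Synthetic example: 8 samples, 3 SNPs; the phenotype depends only on the first SNP.
ref_alleles = ["A", "C", "G"]
genotypes = [
    ["AA", "CC", "GG"], ["AG", "CT", "GG"], ["GG", "CC", "GT"], ["AA", "TT", "TT"],
    ["AG", "CC", "GT"], ["GG", "CT", "GG"], ["AA", "CT", "TT"], ["GG", "TT", "GT"],
]
phenotype = [0.2, 0.5, 0.8, 0.2, 0.5, 0.8, 0.2, 0.8]

X = [[encode_genotype(g, ref) for g, ref in zip(row, ref_alleles)] for row in genotypes]
model = fit_gradient_boosting(X, phenotype)
fitted = [predict(model, row) for row in X]
mse = sum((yi - fi) ** 2 for yi, fi in zip(phenotype, fitted)) / len(phenotype)
mae = sum(abs(yi - fi) for yi, fi in zip(phenotype, fitted)) / len(phenotype)

# Rank SNPs by the total squared-error reduction of the splits that used them.
importance = [0.0] * len(ref_alleles)
for j, _, _, _, gain in model[2]:
    importance[j] += gain
ranked_snps = sorted(range(len(importance)), key=lambda j: importance[j], reverse=True)
print(f"MSE={mse:.4f} MAE={mae:.4f} ranked SNP indices={ranked_snps}")
```

In this toy setting the first SNP should rank highest, because the synthetic phenotype was constructed from it alone. In practice a library implementation such as scikit-learn's GradientBoostingRegressor would normally replace the hand-written loop, with errors estimated on held-out data.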