A Novel Ensemble Stacking Classification of Genetic Variations Using Machine Learning Algorithms

Autor: Yeturu Jahnavi, Poongothai Elango, S. P. Raja, P. Nagendra Kumar
Rok vydání: 2021
Předmět:
Zdroj: International Journal of Image and Graphics. 23
ISSN: 1793-6756
0219-4678
Popis: Genetics is the clinical review of congenital mutation, where the principal advantage of analyzing genetic mutation of humans is the exploration, analysis, interpretation and description of the genetic transmitted and inherited effect of several diseases such as cancer, diabetes and heart diseases. Cancer is the most troublesome and disordered affliction as the proportion of cancer sufferers is growing massively. Identification and discrimination of the mutations that impart to the enlargement of tumor from the unbiased mutations is difficult, as majority tumors of cancer are able to exercise genetic mutations. The genetic mutations are systematized and categorized to sort the cancer by way of medical observations and considering clinical studies. At the present time, genetic mutations are being annotated and these interpretations are being accomplished either manually or using the existing primary algorithms. Evaluation and classification of each and every individual genetic mutation was basically predicated on evidence from documented content built on medical literature. Consequently, as a means to build genetic mutations, basically, depending on the clinical evidences persists a challenging task. There exist various algorithms such as one hot encoding technique is used to derive features from genes and their variations, TF-IDF is used to extract features from the clinical text data. In order to increase the accuracy of the classification, machine learning algorithms such as support vector machine, logistic regression, Naive Bayes, etc., are experimented. A stacking model classifier has been developed to increase the accuracy. The proposed stacking model classifier has obtained the log loss 0.8436 and 0.8572 for cross-validation data set and test data set, respectively. By the experimentation, it has been proved that the proposed stacking model classifier outperforms the existing algorithms in terms of log loss. Basically, minimum log loss refers to the efficient model. Here the log loss has been reduced to less than 1 by using the proposed stacking model classifier. The performance of these algorithms can be gauged on the basis of the various measures like multi-class log loss.
Databáze: OpenAIRE