Automatic Extraction of Highly Predictive Sequence Features that Incorporate Contiguity and Mutation

Authors: Carolina Ruiz, Hao Wan, Joseph E. Beck
Year of publication: 2015
Source: Biomedical Engineering Systems and Technologies, ISBN: 9783319261287
BIOSTEC (Selected Papers)
Description: This paper investigates the problem of extracting sequence features that are useful for constructing prediction models. The method introduced in this paper generates such features by considering contiguous subsequences and their mutations, and by selecting those candidate features that have a strong association with the classification target according to the Gini index. Experimental results on three genetic data sets provide evidence of the superiority of this method over other sequence feature generation methods from the literature, especially in domains where the presence of features within a sequence, rather than their specific location, is pertinent for classification.
Database: OpenAIRE
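
Illustration: the description above names three ingredients: contiguous subsequences, their mutations, and selection by the Gini index. The Python sketch below shows one way these could fit together. It is not the authors' implementation. It assumes that a "mutation" means a bounded number of single-character substitutions, that a feature is the presence of a pattern anywhere in a sequence, and that candidates are ranked by the weighted Gini impurity of the resulting presence/absence split (lower is better). All function names and the toy data are hypothetical.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of contiguous length-k subsequences of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hamming(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def contains(seq, pattern, max_mismatch=0):
    """True if some window of seq matches pattern within max_mismatch
    substitutions (an assumed, simplified notion of mutation)."""
    k = len(pattern)
    return any(
        hamming(seq[i:i + k], pattern) <= max_mismatch
        for i in range(len(seq) - k + 1)
    )


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_c^2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def gini_split(present, labels):
    """Weighted Gini impurity after splitting on feature presence."""
    yes = [y for p, y in zip(present, labels) if p]
    no = [y for p, y in zip(present, labels) if not p]
    n = len(labels)
    return len(yes) / n * gini_impurity(yes) + len(no) / n * gini_impurity(no)


def rank_features(seqs, labels, k, max_mismatch=1):
    """Score every length-k candidate by its split impurity, best first."""
    candidates = set().union(*(kmers(s, k) for s in seqs))
    scored = []
    for pattern in sorted(candidates):
        present = [contains(s, pattern, max_mismatch) for s in seqs]
        scored.append((gini_split(present, labels), pattern))
    return sorted(scored)


# Toy data, invented for illustration only.
seqs = ["AACGT", "TTACG", "GGGTT", "TTGGG"]
labels = [1, 1, 0, 0]
ranking = rank_features(seqs, labels, k=3, max_mismatch=0)
```

Because the feature is presence anywhere in the sequence, a pattern that appears at different positions in different sequences still counts as the same feature, which matches the location-independent setting the description highlights.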