Neural-based approaches to overcome feature selection and applicability domain in drug-related property prediction

Authors: Ignacio Ponzoni, María Virginia Sabando, Axel J. Soto
Year of publication: 2019
Subject:
FEATURE SELECTION
0209 industrial biotechnology
Quantitative structure–activity relationship
Computer science
Feature selection
MODEL INTERPRETABILITY
02 engineering and technology
NEURAL NETWORKS
Machine learning
020901 industrial engineering & automation
0202 electrical engineering, electronic engineering, information engineering
Pharmaceutical sciences
Other Computer and Information Sciences
Selection (genetic algorithm)
Interpretability
Artificial neural network
Drug discovery
Deep learning
Biological activity
Chemical space
QSAR MODELING
APPLICABILITY DOMAIN
Computer and Information Sciences
020201 artificial intelligence & image processing
Artificial intelligence
NATURAL AND EXACT SCIENCES
Software
Applicability domain
Source: Applied Soft Computing. 85:105777
ISSN: 1568-4946
Description: In pharmaceutical research and the biomedical sciences, QSAR modeling is an established approach in drug discovery for predicting the biological activity of drug candidates. Yet QSAR modeling poses a series of open challenges. First, chemical compounds are represented in a high-dimensional space, so feature selection is typically applied, although this task entails a challenging combinatorial problem with potential loss of information. Second, defining the applicability domain of a QSAR model is desirable for determining the reliability of predictions on unseen chemicals, but it is often difficult to assess due to the extent of the chemical space. Finally, the interpretability of these models is also a critical issue for drug designers. The purpose of this work is to thoroughly assess the application of neural-based methods and recent advances in deep learning to QSAR modeling. We hypothesize that neural-based methods can overcome the need to perform a descriptor selection phase. We developed three QSAR models based on neural networks for the prediction of relevant chemical and biomedical properties that, in the absence of any feature selection step, can outperform the state-of-the-art models for such properties. We also implemented an embedded applicability domain technique based on network output probabilities that proved to be effective; its application improved the predictive performance of the model. Finally, we proposed the use of a post hoc feature analysis technique based on an aggregation of network weights, which enabled effective detection of relevant features in the model.
Affiliation: Sabando, María Virginia. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina
Affiliation: Ponzoni, Ignacio. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina
Affiliation: Soto, Axel Juan. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina
Database: OpenAIRE