Neural-based approaches to overcome feature selection and applicability domain in drug-related property prediction
Author: | Ignacio Ponzoni, María Virginia Sabando, Axel J. Soto |
---|---|
Publication year: | 2019 |
Subject: |
FEATURE SELECTION; industrial biotechnology; Quantitative structure–activity relationship; Computer science; Feature selection; MODEL INTERPRETABILITY; engineering and technology; NEURAL NETWORKS; Machine learning; industrial engineering & automation; electrical engineering, electronic engineering, information engineering; Pharmaceutical sciences; Other Computer and Information Sciences; Selection (genetic algorithm); Interpretability; Artificial neural network; Drug discovery; Deep learning; Biological activity; Chemical space; QSAR MODELING; APPLICABILITY DOMAIN; Computer and Information Sciences; artificial intelligence & image processing; Artificial intelligence; NATURAL AND EXACT SCIENCES; Software; Applicability domain |
Source: | Applied Soft Computing. 85:105777 |
ISSN: | 1568-4946 |
Description: | In the fields of pharmaceutical research and biomedical sciences, QSAR modeling is an established approach during drug discovery for prediction of biological activity of drug candidates. Yet, QSAR modeling poses a series of open challenges. First, chemical compounds are represented in a high-dimensional space and thus feature selection is typically applied, although this task entails a challenging combinatorial problem with potential loss of information. Second, the definition of the applicability domain of a QSAR model is a desirable aspect to determine the reliability of predictions on unseen chemicals, which is often difficult to assess due to the extent of the chemical space. Finally, interpretability of these models is also a critical issue for drug designers. The purpose of this work is to thoroughly assess the application of neural-based methods and recent advances in deep learning for QSAR modeling. We hypothesize that neural-based methods can overcome the need to perform a descriptor selection phase. We developed three QSAR models based on neural networks for prediction of relevant chemical and biomedical properties that, in the absence of any feature selection step, can outperform the state-of-the-art models for such properties. We also implemented an embedded applicability domain technique based on network output probabilities that proved to be effective; its application improved the predictive performance of the model. Finally, we proposed the use of a post hoc feature analysis technique based on an aggregation of network weights, which enabled effective detection of relevant features in the model. Affiliation: Sabando, María Virginia. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina. Affiliation: Ponzoni, Ignacio. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina. Affiliation: Soto, Axel Juan. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina |
Database: | OpenAIRE |
External link: |
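
The description mentions two techniques without giving their formulas: an applicability domain based on network output probabilities, and a post hoc feature analysis based on an aggregation of network weights. The following is only a minimal sketch of how such ideas might look, not the authors' implementation. It assumes a classifier that outputs one probability per class, a confidence cutoff of 0.8, and a plain feed-forward network whose layer weight matrices have shape (inputs, outputs). The function names `in_domain_mask` and `aggregate_input_relevance` are hypothetical, and the aggregation rule (products of absolute weights summed over all paths) is one common choice that may differ from the one used in the paper.

```python
import numpy as np


def in_domain_mask(probs, threshold=0.8):
    """Flag predictions whose highest class probability reaches the threshold.

    Assumption: a compound counts as inside the applicability domain when
    the network is confident enough; the threshold value is illustrative.
    """
    probs = np.asarray(probs, dtype=float)
    return probs.max(axis=1) >= threshold


def aggregate_input_relevance(weight_matrices):
    """Score each input feature by aggregating absolute weights over layers.

    Assumption: each matrix has shape (n_in, n_out) and biases are ignored.
    Absolute weights are multiplied along every input-to-output path,
    summed per input feature, and normalized to sum to 1.
    """
    relevance = np.abs(np.asarray(weight_matrices[0], dtype=float))
    for w in weight_matrices[1:]:
        relevance = relevance @ np.abs(np.asarray(w, dtype=float))
    scores = relevance.sum(axis=1)
    total = scores.sum()
    return scores / total if total > 0 else scores


# Toy data, invented purely for illustration.
toy_probs = np.array([[0.95, 0.05], [0.55, 0.45], [0.10, 0.90]])
toy_mask = in_domain_mask(toy_probs, threshold=0.8)

toy_weights = [
    np.array([[1.0, -1.0], [0.0, 0.0], [2.0, 0.0]]),  # 3 inputs -> 2 hidden
    np.array([[1.0], [-1.0]]),                        # 2 hidden -> 1 output
]
toy_scores = aggregate_input_relevance(toy_weights)
```

In this toy case the second prediction falls below the cutoff and would be excluded from evaluation, and the second input feature receives a relevance of zero because all of its outgoing weights are zero. Predictions outside the mask could be withheld or reported separately, which is one way an applicability domain filter can raise measured performance on the retained compounds.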