Description: |
Owing to the success of deep learning, researchers are exploring its application in drug discovery to improve the accuracy of prediction models. Diverse convolutional neural network (CNN) models have achieved significant performance improvements in computer vision, and preparing an input format suitable for CNNs is one of the major questions that must be answered to harness these advances for chemical data. Deep learning studies on molecular structure data have reported improved prediction accuracy; however, the improvement was insufficient from an industry perspective. Furthermore, a recent study suggested that conventional machine learning models can outperform deep learning models on chemical data. As only a limited number of feature calculation methods are available for molecules in deep learning studies, it is crucial to develop more methods for calculating features appropriate for deep learning model development. This study introduces a topological distance-based electron interaction (TDEi) tensor that transforms a molecular structure into image-like 3D arrays based on the electron interactions (Eis) within a molecule. Ei is the fundamental level of information that determines the chemical properties of a molecule. The prediction accuracy of the CNN model with the TDEi tensor was tested on four datasets: MP (275,131), Lipop (4,193), Esol (1,127), and Freesolv (639), and the models achieved desirable prediction accuracy. Because the CNN model exhibited outstanding performance in automatic feature extraction, feature space variation was visualized by taking outputs from the middle of the CNN architecture. The correlation between the CNN features and the target endpoints strengthened as outputs were extracted from deeper layers of the CNN. |