Deep learning and generative methods in cheminformatics and chemical biology: navigating small molecule space intelligently
Author: | Neil Swainston, Soumitra Samanta, Douglas B. Kell |
---|---|
Year of publication: | 2020 |
Subject: |
Artificial intelligence; Bioinformatics; Computer science; Space (commercial competition); General chemistry; Natural sciences; Biochemistry; Medical and health sciences; Deep learning; Discriminative model; Computer simulation; Representation (mathematics); Review articles; Molecular biology; Developmental biology; Health sciences; Artificial neural network; Cheminformatics; Cell biology; Chemical space; Chemical sciences; Variety (cybernetics) |
Source: | Kell, D B, Samanta, S & Swainston, N 2020, 'Deep learning and generative methods in cheminformatics and chemical biology: navigating small molecule space intelligently', Biochemical Journal, vol. 477, no. 23, pp. 4559-4580. https://doi.org/10.1042/BCJ20200781 |
ISSN: | 1470-8728 0264-6021 |
Description: | The number of ‘small’ molecules that may be of interest to chemical biologists (chemical space) is enormous, but the fraction that has ever been made is tiny. Most strategies are discriminative, i.e. have involved ‘forward’ problems (have molecule, establish properties). However, we normally wish to solve the much harder generative or inverse problem (describe desired properties, find molecule). ‘Deep’ (machine) learning based on large-scale neural networks underpins technologies such as computer vision, natural language processing, driverless cars, and world-leading performance in games such as Go; it can also be applied to the solution of inverse problems in chemical biology. In particular, recent developments in deep learning admit the in silico generation of candidate molecular structures and the prediction of their properties, thereby allowing one to navigate (bio)chemical space intelligently. These methods are revolutionary but require an understanding of both (bio)chemistry and computer science to be exploited to best advantage. We give a high-level (non-mathematical) background to the deep learning revolution, and set out the crucial issue for chemical biology and informatics as a two-way mapping from the discrete nature of individual molecules to the continuous but high-dimensional latent representation that may best reflect chemical space. A variety of architectures can do this; we focus on a particular type known as variational autoencoders. We then provide some examples of recent successes of these kinds of approach, and a look towards the future. |
Database: | OpenAIRE |
External link: |