One class classification as a practical approach for accelerating π-π co-crystal discovery.

Authors: Vriza A; Department of Chemistry and Materials Innovation Factory, University of Liverpool 51 Oxford Street Liverpool L7 3NY UK M.S.Dyer@liverpool.ac.uk.; Leverhulme Research Centre for Functional Materials Design, University of Liverpool Oxford Street Liverpool L7 3NY UK., Canaj AB; Department of Chemistry and Materials Innovation Factory, University of Liverpool 51 Oxford Street Liverpool L7 3NY UK M.S.Dyer@liverpool.ac.uk., Vismara R; Department of Chemistry and Materials Innovation Factory, University of Liverpool 51 Oxford Street Liverpool L7 3NY UK M.S.Dyer@liverpool.ac.uk., Kershaw Cook LJ; Department of Chemistry and Materials Innovation Factory, University of Liverpool 51 Oxford Street Liverpool L7 3NY UK M.S.Dyer@liverpool.ac.uk., Manning TD; Department of Chemistry and Materials Innovation Factory, University of Liverpool 51 Oxford Street Liverpool L7 3NY UK M.S.Dyer@liverpool.ac.uk., Gaultois MW; Department of Chemistry and Materials Innovation Factory, University of Liverpool 51 Oxford Street Liverpool L7 3NY UK M.S.Dyer@liverpool.ac.uk.; Leverhulme Research Centre for Functional Materials Design, University of Liverpool Oxford Street Liverpool L7 3NY UK., Wood PA; Cambridge Crystallographic Data Centre 12 Union Road Cambridge CB2 1EZ UK., Kurlin V; Materials Innovation Factory, Computer Science Department, University of Liverpool Liverpool L69 3BX UK., Berry N; Department of Chemistry and Materials Innovation Factory, University of Liverpool 51 Oxford Street Liverpool L7 3NY UK M.S.Dyer@liverpool.ac.uk., Dyer MS; Department of Chemistry and Materials Innovation Factory, University of Liverpool 51 Oxford Street Liverpool L7 3NY UK M.S.Dyer@liverpool.ac.uk.; Leverhulme Research Centre for Functional Materials Design, University of Liverpool Oxford Street Liverpool L7 3NY UK., Rosseinsky MJ; Department of Chemistry and Materials Innovation Factory, University of Liverpool 51 Oxford Street Liverpool L7 3NY UK M.S.Dyer@liverpool.ac.uk.; Leverhulme Research Centre for Functional Materials Design, University of Liverpool Oxford Street Liverpool L7 3NY UK.
Language: English
Source: Chemical Science [Chem Sci] 2020 Dec 08; Vol. 12 (5), pp. 1702-1719. Date of Electronic Publication: 2020 Dec 08.
DOI: 10.1039/d0sc04263c
Abstract: Machine learning models have brought major changes to decision-making in materials design. One concern for data-driven approaches is the lack of negative data from unsuccessful synthetic attempts, which can produce inherently imbalanced datasets. We propose one-class classification as an effective tool for tackling this limitation in materials design problems. In this approach, a model learns only from a well-defined class, without counterexamples. We study a range of one-class classification algorithms extensively to identify the most appropriate workflow for guiding the discovery of emerging materials that belong to a relatively small class: weakly bound polyaromatic hydrocarbon co-crystals. The two-step approach first trains the model on all known molecular combinations that form this class of co-crystals, extracted from the Cambridge Structural Database (1722 molecular combinations), and then scores possible but as yet unknown pairs from the ZINC15 database (21 736 possible molecular combinations). By focusing on the highest-ranking pairs, which are predicted to be the most likely to form co-crystals, materials discovery can be accelerated: the vast molecular space is reduced and chemists' synthetic efforts are directed. Interpretability techniques are then used to seek a more detailed understanding of the molecular properties that drive co-crystallization. The applicability of the methodology is demonstrated by the discovery of two novel co-crystals, pyrene-6H-benzo[c]chromen-6-one (1) and pyrene-9,10-dicyanoanthracene (2).
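The two-step idea in the abstract (fit on known positives only, then score and rank unlabeled pairs) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the k-nearest-neighbour distance scorer is a simple stand-in and not necessarily one of the algorithms the authors evaluated, the two-dimensional descriptor vectors are synthetic rather than derived from the Cambridge Structural Database or ZINC15, and the names `knn_score`, `rank_candidates`, and `pair_A` to `pair_C` are hypothetical.

```python
import math
import random


def knn_score(train, x, k=3):
    """One-class score: negative mean distance from x to its k nearest
    training points. Higher (closer to zero) means more similar to the
    known positive class. Illustrative stand-in scorer only."""
    dists = sorted(math.dist(t, x) for t in train)
    return -sum(dists[:k]) / k


def rank_candidates(train, candidates, k=3):
    """Score every unlabeled candidate and sort from most to least similar."""
    scored = [(name, knn_score(train, vec, k)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


random.seed(0)

# Step 1: the "training set" holds positives only. These are synthetic
# descriptor vectors standing in for known co-crystal pairs (assumption).
known_pairs = [(random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)) for _ in range(200)]

# Step 2: score unlabeled candidate pairs (hypothetical names and values).
candidates = {
    "pair_A": (0.1, -0.2),   # inside the region covered by known pairs
    "pair_B": (2.5, 2.5),    # at the edge of that region
    "pair_C": (8.0, -7.0),   # far from anything seen in training
}

ranking = rank_candidates(known_pairs, candidates)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

No negative examples are needed at any point: a candidate ranks highly simply because it lies close to combinations already known to form the class, which is what lets the highest-ranking pairs be prioritized for synthesis.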
Competing Interests: There are no conflicts to declare.
(This journal is © The Royal Society of Chemistry.)
Database: MEDLINE