Author: |
Zhang, Pei, Kearney, Logan, Bhowmik, Debsindhu, Fox, Zachary, Naskar, Amit K., Gounley, John |
Year of publication: |
2023 |
Subject: |
|
Document type: |
Working Paper |
Description: |
Transformer-based large language models have remarkable potential to accelerate design optimization for applications such as drug development and materials discovery. Self-supervised pretraining of transformer models requires large-scale datasets, which are often sparsely populated in topical areas such as polymer science. State-of-the-art approaches for polymers conduct data augmentation to generate additional samples, but this unavoidably incurs extra computational cost. In contrast, large-scale open-source datasets are available for small molecules and offer a potential solution to data scarcity through transfer learning. In this work, we show that transformers pretrained on small molecules and fine-tuned on polymer properties achieve accuracy comparable to models trained on augmented polymer datasets for a series of benchmark prediction tasks. |
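For context, a minimal sketch of the transfer-learning recipe the description outlines: a transformer pretrained on small-molecule SMILES is fine-tuned as a regressor on polymer property data. The checkpoint name, polymer repeat-unit SMILES, target values, and hyperparameters below are illustrative assumptions, not details taken from the paper. |

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Illustrative small-molecule pretrained checkpoint; not necessarily the model used in the paper.
checkpoint = "seyonec/ChemBERTa-zinc-base-v1"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint, num_labels=1, problem_type="regression"
)

# Toy polymer repeat-unit SMILES paired with a scalar property target
# (e.g., glass transition temperature in K); values are made up for illustration.
polymer_smiles = ["*CC(*)c1ccccc1", "*CC(*)C(=O)OC"]
targets = torch.tensor([[373.0], [378.0]])

batch = tokenizer(polymer_smiles, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# A single fine-tuning step; a real run would iterate over a DataLoader for many epochs
# and evaluate on held-out polymer benchmark tasks.
model.train()
outputs = model(**batch, labels=targets)
outputs.loss.backward()
optimizer.step()
```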
Database: |
arXiv |
External link: |
|