Knowledge-Guided Efficient Representation Learning for Biomedical Domain

Autor: Guangxu Xun, Aidong Zhang, Nan Du, Kishlay Jha
Rok vydání: 2021
Předmět:
Zdroj: KDD
DOI: 10.1145/3447548.3467118
Popis: Pre-trained concept representations are essential to many biomedical text mining and natural language processing tasks. As such, various representation learning approaches have been proposed in the literature. More recently, contextualized embedding approaches (i.e., BERT based models) that capture the implicit semantics of concepts at a granular level have significantly outperformed the conventional word embedding approaches (i.e., Word2Vec/GLoVE based models). Despite significant accuracy gains achieved, these approaches are often computationally expensive and memory inefficient. To address this issue, we propose a new representation learning approach that efficiently adapts the concept representations to the newly available data. Specifically, the proposed approach develops a knowledge-guided continual learning strategy wherein the accurate/stable context-information present in human-curated knowledge-bases is exploited to continually identify and retrain the representations of those concepts whose corpus-based context evolved coherently over time. Different from previous studies that mainly leverage the curated knowledge to improve the accuracy of embedding models, the proposed research explores the usefulness of semantic knowledge from the perspective of accelerating the training efficiency of embedding models. Comprehensive experiments under various efficiency constraints demonstrate that the proposed approach significantly improves the computational performance of biomedical word embedding models.
Databáze: OpenAIRE