DWE-Med

Authors: Aidong Zhang, Guangxu Xun, Vishrawas Gopalakrishnan, Kishlay Jha
Publication year: 2019
Subject:
Source: ACM Transactions on Knowledge Discovery from Data. 13:1-21
ISSN: 1556-472X
1556-4681
DOI: 10.1145/3310254
Description: Recent advances in unsupervised language processing methods have created an opportunity to exploit massive text corpora for developing high-quality vector space representations of words (also known as word embeddings). To this end, practitioners have developed and applied several data-driven embedding models with a good rate of success. However, a drawback of these models lies in their premise of static context, wherein the meaning of a word is assumed to remain the same over time. This is limiting because the semantic meaning of a concept is known to evolve over time. While such semantic drifts are routinely observed in almost all domains, their effect is acute in domains such as biomedicine, where the semantic meaning of a concept changes relatively fast. To address this, in this study we aim to learn temporally aware vector representations of medical concepts from timestamped text data and, in doing so, provide a systematic approach to formalizing the problem. More specifically, we propose a dynamic word embedding based model that jointly learns the temporal characteristics of medical concepts and performs alignment across time. Apart from capturing the evolutionary characteristics in an optimal manner, the model also factors in the implicit medical properties useful for a variety of biomedical applications. Empirical studies conducted on two important biomedical use cases validate the effectiveness of the proposed approach and suggest that the model not only learns quality embeddings but also facilitates intuitive trajectory visualizations.
Database: OpenAIRE