DWE-Med
Author: | Aidong Zhang, Guangxu Xun, Vishrawas Gopalakrishnan, Kishlay Jha |
---|---|
Publication year: | 2019 |
Subject: |
0301 basic medicine; Text corpus; Word embedding; General Computer Science; business.industry; Computer science; Context (language use); 02 engineering and technology; computer.software_genre; Data-driven; Domain (software engineering); 03 medical and health sciences; 030104 developmental biology; 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; Embedding; Artificial intelligence; Representation (mathematics); business; computer; Word (computer architecture); Natural language processing |
Source: | ACM Transactions on Knowledge Discovery from Data. 13:1-21 |
ISSN: | 1556-472X 1556-4681 |
DOI: | 10.1145/3310254 |
Description: | Recent advances in unsupervised language processing methods have created an opportunity to exploit massive text corpora for developing high-quality vector space representations of words (also known as word embeddings). To this end, practitioners have developed and applied several data-driven embedding models with considerable success. However, a drawback of these models lies in their premise of a static context, wherein the meaning of a word is assumed to remain the same over time. This is limiting because the semantic meaning of a concept is known to evolve over time. While such semantic drifts are routinely observed in almost all domains, their effect is acute in domains such as biomedicine, where the semantic meaning of a concept changes relatively fast. To address this, in this study we aim to learn temporally aware vector representations of medical concepts from timestamped text data and, in doing so, provide a systematic approach to formalizing the problem. More specifically, we propose a dynamic word embedding based model that jointly learns the temporal characteristics of medical concepts and performs alignment across time. Apart from capturing the evolutionary characteristics in an optimal manner, the model also factors in the implicit medical properties useful for a variety of biomedical applications. Empirical studies conducted on two important biomedical use cases validate the effectiveness of the proposed approach and suggest that the model not only learns quality embeddings but also facilitates intuitive trajectory visualizations. |
Database: | OpenAIRE |
External link: |