Emu: Enhancing Multilingual Sentence Embeddings with Semantic Specialization
Author: | Wataru Hirota, Yoshihiko Suhara, Wang-Chiew Tan, Behzad Golshan |
---|---|
Language: | English |
Publication year: | 2019 |
Subject: |
FOS: Computer and information sciences
Computer Science - Machine Learning; Computer Science - Computation and Language; business.industry; Computer science; InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL; General Medicine; computer.software_genre; ComputingMethodologies_ARTIFICIALINTELLIGENCE; Task (project management); Machine Learning (cs.LG); ComputingMethodologies_PATTERNRECOGNITION; Semantic similarity; Specialization (logic); ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; Embedding; Artificial intelligence; business; computer; Classifier (UML); Computation and Language (cs.CL); Natural language processing; Sentence |
Source: | AAAI |
Description: | We present Emu, a system that semantically enhances multilingual sentence embeddings. Our framework fine-tunes pre-trained multilingual sentence embeddings using two main components: a semantic classifier and a language discriminator. The semantic classifier improves the semantic similarity of related sentences, whereas the language discriminator enhances the multilinguality of the embeddings via multilingual adversarial training. Our experimental results based on several language pairs show that our specialized embeddings outperform the state-of-the-art multilingual sentence embedding model on the task of cross-lingual intent classification using only monolingual labeled data. AAAI 2020 |
Database: | OpenAIRE |
External link: |
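
Illustrative note: the description names two components, a semantic classifier and a language discriminator trained adversarially. The sketch below shows one common way such a pair of objectives can be combined when fine-tuning an embedding model. It is not taken from the paper: the function names, the use of plain cross-entropy for both losses, and the `adv_weight` parameter are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Negative log-likelihood of the target class."""
    return -math.log(softmax(logits)[target_index])


def encoder_objective(intent_logits, intent_label,
                      lang_logits, lang_label, adv_weight=0.1):
    """Hypothetical loss minimized by the embedding fine-tuner.

    The semantic (intent) loss is minimized directly. The language
    discriminator's loss is subtracted, so the encoder is rewarded
    when the discriminator cannot identify the sentence's language.
    adv_weight is an assumed trade-off hyperparameter.
    """
    semantic_loss = cross_entropy(intent_logits, intent_label)
    discriminator_loss = cross_entropy(lang_logits, lang_label)
    total = semantic_loss - adv_weight * discriminator_loss
    return total, semantic_loss, discriminator_loss


# Example: confident intent prediction, discriminator at chance level.
total, sem, disc = encoder_objective([2.0, 0.5, -1.0], 0, [0.0, 0.0], 1)
```

In a full adversarial setup the discriminator would be updated separately to minimize its own loss, alternating with the encoder update sketched here.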