Semantically Enriching Embeddings of Highly Inflectable Verbs for Improving Intent Detection in a Romanian Home Assistant Scenario
Author: | Anda-Diana Stoica, Rodica Potolea, Mihaela Dînșoreanu, Andrei-Cristian Rad, Camelia Lemnaru, Ioan-Horia-Mihai Muntean |
---|---|
Year of publication: | 2021 |
Subject: | Hyperparameter; Word embedding; business.industry; Synonym; Computer science; Romanian; 02 engineering and technology; Representation (arts); computer.software_genre; language.human_language; Task (project management); Semantic similarity; 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; language; 020201 artificial intelligence & image processing; Artificial intelligence; business; computer; Natural language processing; Word (computer architecture) |
Source: | Advances in Intelligent Data Analysis XIX; ISBN: 9783030742508; IDA |
DOI: | 10.1007/978-3-030-74251-5_20 |
Description: | Word embeddings are known to encapsulate semantic similarity and have become the preferred representation for NLP models. However, they do not identify the type of semantic relationship, which can be crucial in some applications. This paper adapts an existing solution for enhancing word embedding representations so that they better separate synonyms from antonyms, in an intent detection task applied to a Romanian home assistant scenario. To account for the morphological richness of Romanian, our method adds an augmentation step that generates conjugated pairs of antonym and synonym verbs. The generated pairs are run through the counter-fitting step (inspired by the literature), for which we propose a justified improvement to one of the hyperparameters. Evaluations on the home assistant scenario show that the pre-processing step plays an essential role in reducing opposing-intent errors in the classification model (by almost two thirds). |
Database: | OpenAIRE |
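
The augmentation step mentioned in the description (generating conjugated pairs from antonym and synonym verbs) can be illustrated with a minimal sketch. This is not the authors' implementation: the conjugation table `CONJUGATIONS`, the function `expand_pairs`, and the choice to pair only forms that share the same person/number slot are all assumptions made for illustration. The record does not say how the paper aligns conjugated forms or where it obtains them.

```python
from typing import Dict, List, Tuple

# Hypothetical toy conjugation table (present tense only): lemma -> slot -> form.
# A real system would draw these forms from a morphological lexicon.
CONJUGATIONS: Dict[str, Dict[str, str]] = {
    "aprinde": {"1sg": "aprind", "2sg": "aprinzi", "3sg": "aprinde"},
    "stinge": {"1sg": "sting", "2sg": "stingi", "3sg": "stinge"},
}


def expand_pairs(
    lemma_pairs: List[Tuple[str, str]],
    table: Dict[str, Dict[str, str]],
) -> List[Tuple[str, str]]:
    """Expand lemma-level verb pairs into pairs of conjugated forms.

    Assumption: two forms are paired only when they fill the same
    person/number slot for both verbs.
    """
    expanded: List[Tuple[str, str]] = []
    for left, right in lemma_pairs:
        left_forms = table.get(left, {})
        right_forms = table.get(right, {})
        for slot, left_form in left_forms.items():
            if slot in right_forms:
                expanded.append((left_form, right_forms[slot]))
    return expanded


# "aprinde" (turn on) and "stinge" (turn off) as an antonym pair.
antonym_pairs = expand_pairs([("aprinde", "stinge")], CONJUGATIONS)
print(antonym_pairs)
```

Pairs expanded this way could then be supplied as the antonym (or synonym) constraints of a counter-fitting procedure, so that inflected forms, and not only the lemmas, are pushed apart or pulled together in the embedding space.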