RNNs for Lithuanian Multiword Expressions Identification.

Authors: Bumbulienė, I., Mandravickaitė, J., Bielinskienė, A., Boizou, L., Kovalevskaitė, J., Rimkutė, E., Vilkaitė-Lozdienė, L., Man, K. L., Krilavičius, T.
Subject:
Source: International Journal of Design, Analysis & Tools for Integrated Circuits & Systems; Oct 2018, Vol. 7 Issue 1, pp. 44-47 (4 pages)
Abstract: We describe an experiment on the automatic identification of multiword expressions (MWEs) in a Lithuanian corpus. The training dataset was annotated morphologically with a POS tagger and manually annotated with MWEs by four linguists. Word embeddings were also included in the feature set. Because deep learning methods are widely used in NLP tasks and applications, including MWE identification, our experimental setup used recurrent neural networks (RNNs) to identify contiguous and non-contiguous MWEs of different lengths. The best result (44.9% F1-score) was achieved with an RNN using stochastic gradient descent as the optimizer and categorical cross-entropy as the loss function. [ABSTRACT FROM AUTHOR]
Database: Complementary Index
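
Note: The abstract names stochastic gradient descent and categorical cross-entropy as the best-performing optimizer and loss. The sketch below illustrates only how those two pieces interact on a single token; it is not the authors' model. It assumes a plain linear softmax layer in place of the RNN, a hypothetical three-tag scheme (O, B-MWE, I-MWE) that the record does not specify, and a made-up two-value feature vector.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(probs, target_index):
    """Loss for one token: negative log-probability of the correct tag."""
    return -math.log(probs[target_index])


def sgd_step(weights, features, target_index, lr=0.1):
    """Apply one SGD update in place; return the loss measured before it."""
    logits = [sum(w * x for w, x in zip(row, features)) for row in weights]
    probs = softmax(logits)
    loss = categorical_cross_entropy(probs, target_index)
    for k, row in enumerate(weights):
        # For softmax + cross-entropy, d(loss)/d(logit_k) = p_k - y_k.
        grad_logit = probs[k] - (1.0 if k == target_index else 0.0)
        for j, x in enumerate(features):
            row[j] -= lr * grad_logit * x
    return loss


# Hypothetical tag set and toy features (assumptions, not from the paper).
TAGS = ["O", "B-MWE", "I-MWE"]
weights = [[0.0, 0.0] for _ in TAGS]
features = [1.0, 0.5]

# Repeatedly train on one token whose gold tag is B-MWE.
losses = [sgd_step(weights, features, TAGS.index("B-MWE")) for _ in range(20)]
```

With all weights at zero the three tags start out equally likely, so the first loss equals ln 3, and each subsequent update lowers it. In an RNN tagger the same loss would be computed per token from the recurrent hidden state rather than from fixed features.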