A Framework for Understanding the Role of Morphology in Universal Dependency Parsing
Author: | Pascal Denis, Mathieu Dehouck |
---|---|
Contributors: | Machine Learning in Information Networks (MAGNET); Inria Lille - Nord Europe; Institut National de Recherche en Informatique et en Automatique (Inria); Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 (CRIStAL); Centrale Lille; Université de Lille; Centre National de la Recherche Scientifique (CNRS) |
Language: | English |
Publication year: | 2018 |
Subject: |
Morphology (linguistics); Parsing; Computer science; Syntax; Dependency grammar; Artificial intelligence; Natural language processing; Measure (mathematics); Simple (abstract algebra); Word (computer architecture); [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL]; [INFO.INFO-TT] Computer Science [cs]/Document and Text Processing; [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG]; [SHS.LANGUE] Humanities and Social Sciences/Linguistics; TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES; 02 engineering and technology; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; 01 natural sciences; 0105 earth and related environmental sciences; 010501 environmental sciences |
Source: | EMNLP 2018 - Conference on Empirical Methods in Natural Language Processing, Oct 2018, Brussels, Belgium; HAL |
Description: | This paper presents a simple framework for characterizing morphological complexity and how it encodes syntactic information. In particular, we propose a new measure of morphosyntactic complexity in terms of governor-dependent preferential attachment that explains parsing performance. Through experiments on dependency parsing with data from Universal Dependencies (UD), we show that representations derived from morphological attributes deliver important parsing performance improvements over standard word form embeddings when trained on the same datasets. We also show that the new morphosyntactic complexity measure is predictive of the gains provided by using morphological attributes over plain forms on parsing scores, making it a tool to distinguish languages using morphology as a syntactic marker from others. |
Database: | OpenAIRE |
External link: |