A Multi- versus a Single-classifier Approach for the Identification of Modality in the Portuguese Language
Author: | Sequeira, João; Gonçalves, Teresa; Quaresma, Paulo; Mendes, Amália; Hendrickx, Iris |
---|---|
Contributors: | Repositório da Universidade de Lisboa |
Language: | English |
Year of publication: | 2018 |
Subject: | |
Source: | Repositório Científico de Acesso Aberto de Portugal (RCAAP); Scopus-Elsevier; CIÊNCIAVITAE |
Description: | This work presents a comparative study of two approaches to building an automatic classification system for modality values in the Portuguese language. One approach uses a single multi-class classifier trained on the full dataset, which includes eleven modal verbs; the other builds a separate classifier for each verb. Performance is measured using precision, recall and F1. Because the dataset is unbalanced, a weighted average was calculated for each metric. We use support vector machines as our classifier and experimented with various SVM kernels to find the optimal classifier for the task at hand. We experimented with several types of feature attributes representing parse tree information and compared these complex feature representations against a simple bag-of-words feature representation as a baseline. The best F1 values obtained are above 0.60, and from the results it is possible to conclude that there is no significant difference between the two approaches. |
Database: | OpenAIRE |
External link: |
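
The description reports precision, recall and F1 as weighted averages because the dataset is unbalanced. A minimal sketch of one common way to compute such averages follows, weighting each class by its share of the true labels. The function name `weighted_prf` and the placeholder labels are hypothetical, and this record does not state the exact weighting scheme the authors used.

```python
from collections import Counter


def weighted_prf(y_true, y_pred):
    """Return support-weighted precision, recall and F1 over all true classes."""
    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    pairs = list(zip(y_true, y_pred))
    w_precision = w_recall = w_f1 = 0.0

    for label, count in support.items():
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)

        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0

        # Each class contributes in proportion to its share of the true labels.
        weight = count / total
        w_precision += weight * precision
        w_recall += weight * recall
        w_f1 += weight * f1

    return w_precision, w_recall, w_f1


# Toy example with placeholder class labels (not taken from the paper's data).
y_true = ["A", "A", "A", "B"]
y_pred = ["A", "A", "B", "B"]
p, r, f = weighted_prf(y_true, y_pred)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

With this weighting, frequent classes dominate the average, so the score reflects the class distribution of the data rather than treating every class equally.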