English Multi-Word Expressions (MWE): A Tagset for Health Domain

Author: Srishti Sing, Girish Nath Jha
Year of publication: 2018
Subject:
Source: ICACCI
DOI: 10.1109/icacci.2018.8554795
Description: This paper discusses the need for an independent MWE tagset to handle the technical language of clinical English in the healthcare domain, and reports efforts to develop a tagset, a guideline, and a tagger for that domain. The tagset contains 12 tags. To test the reliability of the tagset on medical data, a CRF-based MWE tagger model was trained, reaching an accuracy of 73%. The tagger, which is under continuous improvement through cleanup of the corpus, the annotation, and errors, has since improved to an accuracy of 79%.
Database: OpenAIRE