English Multi-Word Expressions (MWE): A Tagset for Health Domain
Author: | Srishti Sing, Girish Nath Jha |
---|---|
Year of publication: | 2018 |
Subject: |
060201 languages & linguistics; business.industry; Computer science; 06 humanities and the arts; 02 engineering and technology; computer.software_genre; Domain (software engineering); Annotation; 0602 languages and literature; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Artificial intelligence; business; computer; Natural language processing; Word (computer architecture) |
Source: | ICACCI |
DOI: | 10.1109/icacci.2018.8554795 |
Description: | This paper argues that the technical vocabulary of clinical English in the healthcare domain calls for an independent MWE tagset, and reports efforts to develop a tagset, a guideline, and a tagger for that domain. The tagset contains 12 tags. To test its reliability on medical data, a CRF-based MWE tagger model was trained, reaching an accuracy of 73%. The tagger, which is under continuous improvement through sanitizing the corpus and annotations and correcting errors, has since improved to an accuracy of 79%. |
Database: | OpenAIRE |