How You Ask Matters: The Effect of Paraphrastic Questions to BERT Performance on a Clinical SQuAD Dataset

Author:	Jungwei Fan, Sungrim (Riea) Moon
Year of publication:	2020
Subject:	health sciences; Computer science; engineering and technology; medical and health sciences; Reading comprehension; Ask price; electrical engineering, electronic engineering, information engineering; artificial intelligence & image processing; Syntactic structure; Artificial intelligence; Natural language processing; developmental biology; Transformer (machine learning model)
Source:	ClinicalNLP@EMNLP
Description:	Reading comprehension style question-answering (QA) based on patient-specific documents represents a growing area in clinical NLP with plentiful applications. Bidirectional Encoder Representations from Transformers (BERT) and its derivatives lead the state-of-the-art accuracy on the task, but most evaluation has treated the data as a pre-mixture without systematically looking into the potential effect of imperfect train/test questions. The current study seeks to address this gap by experimenting with full versus partial train/test data consisting of paraphrastic questions. Our key findings include 1) training with all pooled question variants yielded best accuracy, 2) the accuracy varied widely, from 0.74 to 0.80, when trained with each single question variant, and 3) questions of similar lexical/syntactic structure tended to induce identical answers. The results suggest that how you ask questions matters in BERT-based QA, especially at the training stage.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::e7a80080a15069c92acef5609526ecaa https://doi.org/10.18653/v1/2020.clinicalnlp-1.13