Pre-trained Language Model for Biomedical Question Answering
Author: Jinhyuk Lee, Wonjin Yoon, Donghyeon Kim, Jaewoo Kang, Minbyul Jeong
Year of publication: 2020
Subject: Computer science; Natural language processing; Question answering; Language model; Artificial intelligence; Transfer learning; Factoid questions; Medical and health sciences; Clinical medicine
Source: Machine Learning and Knowledge Discovery in Databases, PKDD/ECML Workshops (2), ISBN 9783030438869
DOI: 10.1007/978-3-030-43887-6_64
Description: The recent success of question answering systems is largely attributed to pre-trained language models. However, because language models are mostly pre-trained on general-domain corpora such as Wikipedia, they often have difficulty understanding biomedical questions. In this paper, we investigate the performance of BioBERT, a pre-trained biomedical language model, in answering biomedical questions of the factoid, list, and yes/no types. Using almost the same structure across the various question types, BioBERT achieved the best performance in the 7th BioASQ Challenge (Task 7b, Phase B). BioBERT pre-trained on SQuAD or SQuAD 2.0 easily outperformed previous state-of-the-art models, and it obtained the best performance when appropriate pre-/post-processing strategies were applied to questions, passages, and answers. (A minimal usage sketch follows this record.)
Database: OpenAIRE
External link:
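To make the transfer-learning recipe in the description concrete, here is a minimal sketch of extractive (factoid-style) biomedical QA with a BioBERT checkpoint. This is not the authors' released code: it assumes the HuggingFace transformers library and the checkpoint name "dmis-lab/biobert-base-cased-v1.1-squad" (a BioBERT model further trained on SQuAD); both the library choice and the checkpoint name are assumptions, not details from this record.

```python
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

# Assumed checkpoint name (BioBERT further trained on SQuAD); not from the record.
MODEL = "dmis-lab/biobert-base-cased-v1.1-squad"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForQuestionAnswering.from_pretrained(MODEL)
model.eval()

question = "Which gene is mutated in cystic fibrosis?"
passage = (
    "Cystic fibrosis is caused by mutations in the CFTR gene, "
    "which encodes a chloride channel expressed in epithelial cells."
)

# Encode the (question, passage) pair as a single sequence.
inputs = tokenizer(question, passage, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Pick the highest-scoring start/end token positions for the answer span.
# (A sketch only: a robust decoder would also enforce start <= end.)
start = int(outputs.start_logits.argmax())
end = int(outputs.end_logits.argmax())
answer = tokenizer.decode(inputs["input_ids"][0][start : end + 1])
print(answer)  # ideally "CFTR"
```

A list-type question could reuse the same span extraction with post-processing (e.g., collecting several high-scoring spans), while a yes/no question would need a classification head instead of span prediction; the exact pre-/post-processing strategies are described in the paper itself.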