UCSD-Adobe at MEDIQA 2021: Transfer Learning and Answer Sentence Selection for Medical Summarization
Authors: | Franck Dernoncourt, Emilia Farcas, Ndapa Nakashole, Khalil Mrini, Seung-Hyun Yoon, Trung Bui, Walter Chang |
---|---|
Year of publication: | 2021 |
Subject: | 0303 health sciences; Computer science; business.industry; Context (language use); 010501 environmental sciences; computer.software_genre; 01 natural sciences; Automatic summarization; Task (project management); Domain (software engineering); 03 medical and health sciences; Selection (linguistics); Leverage (statistics); Artificial intelligence; business; Transfer of learning; computer; Sentence; Natural language processing; 030304 developmental biology; 0105 earth and related environmental sciences |
Source: | BioNLP@NAACL-HLT |
DOI: | 10.18653/v1/2021.bionlp-1.28 |
Description: | In this paper, we describe our approach to question summarization and multi-answer summarization in the context of the 2021 MEDIQA shared task (Ben Abacha et al., 2021). We propose two kinds of transfer learning for the abstractive summarization of medical questions. First, we train on HealthCareMagic, a large question summarization dataset collected from an online healthcare service platform. Second, we leverage the ability of the BART encoder-decoder architecture to model both generation and classification tasks to train on the task of Recognizing Question Entailment (RQE) in the medical domain. We show that both transfer learning methods combined achieve the highest ROUGE scores. Finally, we cast the question-driven extractive summarization of multiple relevant answer documents as an Answer Sentence Selection (AS2) problem. We show how we can preprocess the MEDIQA-AnS dataset such that it can be trained in an AS2 setting. Our AS2 model is able to generate extractive summaries achieving high ROUGE scores. |
Database: | OpenAIRE |
External link: |