Transformer-based structuring of free-text radiology report databases.
Author: | Nowak S; Department of Diagnostic and Interventional Radiology, University Hospital Bonn, Venusberg-Campus 1, 53127, Bonn, Germany. sebastian.nowak@ukbonn.de., Biesner D; Fraunhofer Institute for Intelligent Analysis and Information Systems IAIS, Sankt Augustin, Germany., Layer YC; Department of Diagnostic and Interventional Radiology, University Hospital Bonn, Venusberg-Campus 1, 53127, Bonn, Germany., Theis M; Department of Diagnostic and Interventional Radiology, University Hospital Bonn, Venusberg-Campus 1, 53127, Bonn, Germany., Schneider H; Fraunhofer Institute for Intelligent Analysis and Information Systems IAIS, Sankt Augustin, Germany., Block W; Department of Diagnostic and Interventional Radiology, University Hospital Bonn, Venusberg-Campus 1, 53127, Bonn, Germany., Wulff B; Fraunhofer Institute for Intelligent Analysis and Information Systems IAIS, Sankt Augustin, Germany., Attenberger UI; Department of Diagnostic and Interventional Radiology, University Hospital Bonn, Venusberg-Campus 1, 53127, Bonn, Germany., Sifa R; Fraunhofer Institute for Intelligent Analysis and Information Systems IAIS, Sankt Augustin, Germany., Sprinkart AM; Department of Diagnostic and Interventional Radiology, University Hospital Bonn, Venusberg-Campus 1, 53127, Bonn, Germany. |
---|---|
Language: | English |
Source: | European radiology [Eur Radiol] 2023 Jun; Vol. 33 (6), pp. 4228-4236. Date of Electronic Publication: 2023 Mar 11. |
DOI: | 10.1007/s00330-023-09526-y |
Abstract: | Objectives: To provide insights for on-site development of transformer-based structuring of free-text report databases by investigating different labeling and pre-training strategies. Methods: A total of 93,368 German chest X-ray reports from 20,912 intensive care unit (ICU) patients were included. Two labeling strategies were investigated to tag six findings of the attending radiologist. First, a system based on human-defined rules was applied for annotation of all reports (termed "silver labels"). Second, 18,000 reports were manually annotated in 197 h (termed "gold labels") of which 10% were used for testing. An on-site pre-trained model (T Results: T Conclusions: Custom pre-training of transformers and fine-tuning on manual annotations promises to be an efficient strategy to unlock report databases for data-driven medicine. Key Points: • On-site development of natural language processing methods that retrospectively unlock free-text databases of radiology clinics for data-driven medicine is of great interest. • For clinics seeking to develop methods on-site for retrospective structuring of a report database of a certain department, it remains unclear which of previously proposed strategies for labeling reports and pre-training models is the most appropriate in context of, e.g., available annotator time. • Using a custom pre-trained transformer model, along with a little annotation effort, promises to be an efficient way to retrospectively structure radiological databases, even if not millions of reports are available for pre-training. (© 2023. The Author(s).) |
Database: | MEDLINE |