Automatic text classification of prostate cancer malignancy scores in radiology reports using NLP models.

Authors: Collado-Montañez J; Department of Computer Science, Advanced Studies Center in ICT (CEATIC), Universidad de Jaén, Campus Las Lagunillas, Jaén, 23071, Spain. jcollado@ujaen.es., López-Úbeda P; Natural Language Processing Unit, HT Médica, Carmelo Torres, nº 2, Jaén, 23007, Spain., Chizhikova M; Department of Computer Science, Advanced Studies Center in ICT (CEATIC), Universidad de Jaén, Campus Las Lagunillas, Jaén, 23071, Spain., Díaz-Galiano MC; Department of Computer Science, Advanced Studies Center in ICT (CEATIC), Universidad de Jaén, Campus Las Lagunillas, Jaén, 23071, Spain., Ureña-López LA; Department of Computer Science, Advanced Studies Center in ICT (CEATIC), Universidad de Jaén, Campus Las Lagunillas, Jaén, 23071, Spain., Martín-Noguerol T; MRI Unit, Radiology Department, HT Médica, Carmelo Torres, nº 2, Jaén, 23007, Spain., Luna A; MRI Unit, Radiology Department, HT Médica, Carmelo Torres, nº 2, Jaén, 23007, Spain., Martín-Valdivia MT; Department of Computer Science, Advanced Studies Center in ICT (CEATIC), Universidad de Jaén, Campus Las Lagunillas, Jaén, 23071, Spain.
Language: English
Source: Medical & biological engineering & computing [Med Biol Eng Comput] 2024 Nov; Vol. 62 (11), pp. 3373-3383. Date of Electronic Publication: 2024 Jun 07.
DOI: 10.1007/s11517-024-03131-x
Abstract: This paper presents the implementation of two automated text classification systems for prostate cancer findings based on the PI-RADS criteria: a traditional machine learning model using XGBoost and a language model-based approach using RoBERTa. The study focused on Spanish-language radiology reports of prostate MRI, a setting that has not been explored before. The results show that the RoBERTa model outperforms the XGBoost model, although both achieve promising results. Furthermore, the best-performing system was integrated into the radiology company's information systems as an API, operating in a real-world environment.
(© 2024. The Author(s).)
Database: MEDLINE