Construction, evaluation, and application of an electronic medical record corpus for cerebral palsy rehabilitation.

Authors: Xiao M; Department of Rehabilitation Medicine, First Affiliated Hospital of Nanchang University, Nanchang, China.; Bioengineering College, Chongqing University, Chongqing, China., Pang Q; Bioengineering College, Chongqing University, Chongqing, China., Zhu Y; Department of Rehabilitation Medicine, First Affiliated Hospital of Nanchang University, Nanchang, China., Shuai L; Department of Rehabilitation Medicine, First Affiliated Hospital of Nanchang University, Nanchang, China., Jin G; Department of Rehabilitation Medicine, First Affiliated Hospital of Nanchang University, Nanchang, China.
Language: English
Source: Digital health [Digit Health] 2024 Sep 27; Vol. 10, pp. 20552076241286260. Date of Electronic Publication: 2024 Sep 27 (Print Publication: 2024).
DOI: 10.1177/20552076241286260
Abstract: Objective: The electronic medical record (EMR) corpus for cerebral palsy rehabilitation and its application in downstream tasks, such as named entity recognition (NER), require further revision and testing to enhance their effectiveness and reliability.
Methods: We devised an annotation principle and developed an EMR corpus for cerebral palsy rehabilitation. Test-retest reliability was introduced for the first time to ensure the consistency of each annotator. Additionally, we established a baseline NER model using the proposed EMR corpus. The NER model used Chinese clinical BERT with adversarial training in the embedding layer, and incorporated a multi-head attention mechanism and rotary position embedding in the encoder layer. For multi-label decoding, we employed the span matrix of global pointer along with softmax and cross-entropy.
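The span-matrix decoding with rotary position embedding mentioned in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (rope, span_scores, decode_spans), the base of 10000, and the decision threshold of 0 are assumptions taken from the commonly described Global Pointer formulation, and the query/key vectors stand in for projections of the encoder output for a single entity type.

```python
import math


def rope(vec, pos, base=10000.0):
    """Apply rotary position embedding: rotate each consecutive pair of
    components of `vec` by an angle proportional to the position `pos`.
    `vec` must have even length. `base` is an assumed default."""
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def span_scores(queries, keys):
    """Build the span matrix for one entity type (hypothetical helper).
    scores[i][j] is the dot product of the rotated query at token i and
    the rotated key at token j. Cells with j < i are not valid spans and
    are masked with -inf."""
    n = len(queries)
    rq = [rope(q, i) for i, q in enumerate(queries)]
    rk = [rope(k, j) for j, k in enumerate(keys)]
    return [
        [
            sum(a * b for a, b in zip(rq[i], rk[j])) if j >= i else float("-inf")
            for j in range(n)
        ]
        for i in range(n)
    ]


def decode_spans(scores, threshold=0.0):
    """Return every (start, end) cell whose score exceeds the threshold.
    Each cell is decided independently, so nested spans such as (0, 2)
    and (1, 1) can both be returned."""
    return [
        (i, j)
        for i, row in enumerate(scores)
        for j, s in enumerate(row)
        if s > threshold
    ]
```

Because every cell of the matrix is scored on its own, a span contained in another span is not excluded, which is why this style of decoder can handle nested entities as well as flat ones. The rotation makes the score depend on the relative offset between the start and end tokens rather than on their absolute positions.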
Results: The corpus consisted of 1405 EMRs, containing a total of 127,523 entities across six different entity types, with 24,424 unique entities after de-duplication. The inter-annotator agreement between the two annotators was 97.57%, and the intra-annotator agreement of each annotator exceeded 98%. The proposed baseline NER model achieved an F1-score of 93.59% for flat entities and 90.15% for nested entities on this corpus.
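To make the agreement and F1 figures above more concrete, the following sketch shows one common way such numbers are computed: exact-match precision, recall, and F1 over (start, end, type) triples. The abstract does not state which agreement measure the authors used, so treating agreement as span-level F1 is an assumption. The function name span_f1 and the entity type labels in the example are hypothetical.

```python
def span_f1(reference, candidate):
    """Exact-match precision, recall, and F1 between two collections of
    (start, end, entity_type) triples. A span counts as a match only if
    all three fields are identical."""
    ref, cand = set(reference), set(candidate)
    tp = len(ref & cand)
    precision = tp / len(cand) if cand else 0.0
    recall = tp / len(ref) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1


# Hypothetical annotations of the same record by two annotators.
# The nested span (1, 2) lies inside (0, 3); the third span differs in
# its end offset, so it does not count as a match.
annotator_a = [(0, 3, "TypeA"), (1, 2, "TypeB"), (5, 7, "TypeC")]
annotator_b = [(0, 3, "TypeA"), (1, 2, "TypeB"), (5, 8, "TypeC")]

precision, recall, f1 = span_f1(annotator_a, annotator_b)
```

The same function can compare one annotator's first and second pass over a record (intra-annotator agreement) or gold annotations against model predictions. F1 is symmetric in its two arguments, so the choice of which annotator is the reference does not change the agreement value.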
Conclusions: We believe that the proposed annotation principle, corpus, and baseline model are highly effective and hold great potential as tools for cerebral palsy rehabilitation scenarios.
Competing Interests: The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
(© The Author(s) 2024.)
Database: MEDLINE