A Natural Language Processing Model to Identify Confidential Content in Adolescent Clinical Notes.
Authors: | Rabbani N; Department of Pediatrics, Stanford University School of Medicine, Stanford, California, United States., Bedgood M; California Department of Public Health, Richmond, California, United States., Brown C; Information Services Department, Lucile Packard Children's Hospital, Palo Alto, California, United States., Steinberg E; Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, California, United States.; Department of Computer Science, Stanford University, Stanford, California, United States., Goldstein RL; Division of Adolescent Medicine, Department of Pediatrics, Stanford University School of Medicine, Stanford, California, United States., Carlson JL; Division of Adolescent Medicine, Department of Pediatrics, Stanford University School of Medicine, Stanford, California, United States., Pageler N; Department of Pediatrics, Stanford University School of Medicine, Stanford, California, United States., Morse KE; Department of Pediatrics, Stanford University School of Medicine, Stanford, California, United States. |
---|---|
Language: | English |
Source: | Applied clinical informatics [Appl Clin Inform] 2023 May; Vol. 14 (3), pp. 400-407. Date of Electronic Publication: 2023 Mar 10. |
DOI: | 10.1055/a-2051-9764 |
Abstract: | Background: The 21st Century Cures Act mandates the immediate, electronic release of health information to patients. However, in the case of adolescents, special consideration is required to ensure that confidentiality is maintained. The detection of confidential content in clinical notes may support operational efforts to preserve adolescent confidentiality while implementing information sharing. Objectives: This study aimed to determine if a natural language processing (NLP) algorithm can identify confidential content in adolescent clinical progress notes. Methods: A total of 1,200 outpatient adolescent progress notes written between 2016 and 2019 were manually annotated to identify confidential content. Labeled sentences from this corpus were featurized and used to train a two-part logistic regression model, which provides both sentence-level and note-level probability estimates that a given text contains confidential content. This model was prospectively validated on a set of 240 progress notes written in May 2022. It was subsequently deployed in a pilot intervention to augment an ongoing operational effort to identify confidential content in progress notes. Note-level probability estimates were used to triage notes for review and sentence-level probability estimates were used to highlight high-risk portions of those notes to aid the manual reviewer. Results: The prevalence of notes containing confidential content was 21% (255/1,200) and 22% (53/240) in the train/test and validation cohorts, respectively. The ensemble logistic regression model achieved an area under the receiver operating characteristic curve of 90% and 88% in the test and validation cohorts, respectively. Its use in a pilot intervention identified outlier documentation practices and demonstrated efficiency gains over completely manual note review. Conclusion: An NLP algorithm can identify confidential content in progress notes with high accuracy. Its human-in-the-loop deployment in clinical operations augmented an ongoing operational effort to identify confidential content in adolescent progress notes. These findings suggest NLP may be used to support efforts to preserve adolescent confidentiality in the wake of the information blocking mandate. Competing Interests: None declared. (Thieme. All rights reserved.) |
Database: | MEDLINE |
External link: |
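
The triage and highlighting step described in the abstract (note-level probabilities select notes for review, sentence-level probabilities mark high-risk passages) might look roughly like the sketch below. It is an illustration only, not the authors' implementation: the `ScoredNote` type, the `triage` and `highlight` helpers, the 0.5 thresholds, and all scores and placeholder sentences are assumptions made here, and the probabilities are hard-coded stand-ins for the outputs of a trained model.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ScoredNote:
    """A progress note with hypothetical model outputs attached."""
    note_id: str
    note_prob: float                    # note-level probability (assumed model output)
    sentences: List[Tuple[str, float]]  # (sentence text, sentence-level probability)


def triage(notes: List[ScoredNote], note_threshold: float = 0.5) -> List[ScoredNote]:
    """Keep notes at or above the threshold, ordered highest-risk first."""
    flagged = [n for n in notes if n.note_prob >= note_threshold]
    return sorted(flagged, key=lambda n: n.note_prob, reverse=True)


def highlight(note: ScoredNote, sentence_threshold: float = 0.5) -> List[str]:
    """Return the sentences a manual reviewer would see highlighted."""
    return [text for text, prob in note.sentences if prob >= sentence_threshold]


# Made-up scores; the thresholds above are assumptions, not values from the study.
notes = [
    ScoredNote("note-a", 0.91, [("Placeholder sentence 1.", 0.88),
                                ("Placeholder sentence 2.", 0.05)]),
    ScoredNote("note-b", 0.12, [("Placeholder sentence 3.", 0.10)]),
    ScoredNote("note-c", 0.67, [("Placeholder sentence 4.", 0.30),
                                ("Placeholder sentence 5.", 0.72)]),
]

for note in triage(notes):
    print(note.note_id, highlight(note))
```

In practice the two thresholds would be chosen from validation data to balance reviewer workload against missed confidential content; the abstract does not state which operating points were used.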