Lessons Learned from the Usability Evaluation of a Simulated Patient Dialogue System

Authors:	Catherine Thomas, Leonardo Campillos-Llanos, Pierre Zweigenbaum, Sophie Rosset, Antoine Neuraz, Eric Bilinski
Contributors:	Instituto de Lengua Literatura y Antropología (ILLA), Consejo Superior de Investigaciones Científicas [Madrid] (CSIC); Information, Langue Ecrite et Signée (ILES), Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur (LIMSI), Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS); Laboratoire Interdisciplinaire des Sciences du Numérique (LISN), CentraleSupélec-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS); Sciences et Technologies des Langues (STL), CentraleSupélec-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS); Accompagnement et Soutien aux Activités de Recherche & Développement (ASARD); Métabolisme, Cancer et Immunité (CRC - UMR_S 1138), Institut Gustave Roussy (IGR)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Sorbonne Université (SU)-Université de Paris (UP); Centre de Recherche des Cordeliers (CRC (UMR_S_1138 / U1138)), École pratique des hautes études (EPHE), Université Paris sciences et lettres (PSL)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Sorbonne Université (SU)-Université de Paris (UP); Laboratoire d'Intégrité des Structures et de Normalisation (LISN), Service d'Etudes Mécaniques et Thermiques (SEMT), Département de Modélisation des Systèmes et Structures (DM2S), CEA-Direction des Energies (ex-Direction de l'Energie Nucléaire) (CEA-DES (ex-DEN)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université Paris-Saclay; Société d'accélération du Transfert de Technologie IDF-Innov [Paris] (SATT); Institut National de Recherche en Informatique et en Automatique (Inria)-CentraleSupélec-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS); Service d'informatique médicale et biostatistiques [CHU Necker], Assistance publique - Hôpitaux de Paris (AP-HP)-CHU Necker - Enfants Malades [AP-HP]; Neuraz, Antoine; Zweigenbaum, Pierre. This work was funded by BPI (FUI Project PatientGenesys, F1310002-P) and by the Société d’Accélération de Transfert Technologique (SATT) Paris Saclay (PVDial project). The funding bodies did not take part in the design of the study, in the analysis and interpretation of data, or in writing the manuscript.
Year of publication:	2021
Subject:	Virtual patient; medicine.medical_specialty; Artificial intelligence; Correctness; Students, Medical; 020205 medical informatics; Computer science; Trainer; Medicine (miscellaneous); [INFO.INFO-TT] Computer Science [cs]/Document and Text Processing; Health Informatics; 02 engineering and technology; Simulated patient; [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL]; Task (project management); Education; User-Computer Interface; Health Information Management; Surveys and Questionnaires; Medical; 0202 electrical engineering, electronic engineering, information engineering; medicine; Humans; Medical physics; Medical history taking; business.industry; Natural language processing; Usability; business; Natural language; Information Systems
Source:	Digital.CSIC: Repositorio Institucional del CSIC, Consejo Superior de Investigaciones Científicas (CSIC); Journal of Medical Systems, Springer Verlag (Germany), 2021, 45 (7), pp.69. ⟨10.1007/s10916-021-01737-4⟩
ISSN:	0148-5598 1573-689X
DOI:	10.1007/s10916-021-01737-4
Description:	Simulated consultations through virtual patients allow medical students to practice history-taking skills. Ideally, applications should provide interactions in natural language and be multi-case and multi-specialty. Nevertheless, few systems handle or are tested on a large variety of cases. We present a virtual patient dialogue system in which a medical trainer types new cases and these are processed without human intervention. To develop it, we designed a patient record model, a knowledge model for the history-taking task, and a termino-ontological model for term variation and out-of-vocabulary words. We evaluated whether this system provided quality dialogue across medical specialities (n = 18), and with unseen cases (n = 29) compared to the cases used for development (n = 6). Medical evaluators (students, residents, practitioners, and researchers) conducted simulated history-taking with the system and assessed its performance through Likert-scale questionnaires. We analysed interaction logs and evaluated system correctness. The mean user evaluation score for the 29 unseen cases was 4.06 out of 5 (very good). The evaluation of correctness determined that, on average, 74.3% (sd = 9.5) of replies were correct, 14.9% (sd = 6.3) incorrect, and in 10.7% the system behaved cautiously by deferring a reply. In the user evaluation, all aspects scored higher in the 29 unseen cases than in the 6 seen cases. Although such a multi-case system has its limits, the evaluation showed that creating it is feasible; that it performs adequately; and that it is judged usable. We discuss some lessons learned and pivotal design choices affecting its performance and the end-users, who are primarily medical students.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::e5d81d3b7aa89a64dc6519b05e21fb9c http://hdl.handle.net/10261/243697