The Adverse Drug Reactions From Patient Reports in Social Media Project: Protocol for an Evaluation Against a Gold Standard

Author:	Rim Aboukhamis, Anita Burgun, Badisse Dahamna, Pierre Karapetiantz, Xiaoyi Chen, Nathalie Texier, Agnès Lillo-Le Louët, Carole Faviez, Yannick Girardeau, Sylvie Guillemin-Lanne, Myrtille Deldossi, Armelle Arnoux-Guenegou, Sandrine Katsahian
Contributors:	Biomedical informatics and public health department, Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP), Centre de Recherche des Cordeliers (CRC (UMR_S_1138 / U1138)), École pratique des hautes études (EPHE), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Sorbonne Université (SU)-Université de Paris (UP), Kappa, Service d'informatique biomédicale [Rouen], CHU Rouen, Normandie Université (NU)-Normandie Université (NU)-Université de Rouen Normandie (UNIROUEN), Normandie Université (NU), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Université Paris Diderot - Paris 7 (UPD7)-Université Paris Descartes - Paris 5 (UPD5)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Sorbonne Université (SU), Temis, Hôpital Européen Georges Pompidou [APHP] (HEGP), Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Hôpitaux Universitaires Paris Ouest - Hôpitaux Universitaires Île de France Ouest (HUPO)
Language:	English
Publication year:	2019
Subject:	Medical terminology; 020205 medical informatics; Computer science; social media; 02 engineering and technology; computer.software_genre; Terminology; 03 medical and health sciences; 0302 clinical medicine; Pharmacovigilance; 0202 electrical engineering, electronic engineering, information engineering; Protocol; Social media; 030212 general & internal medicine; natural language processing; Protocol (science); Information retrieval; General Medicine; Gold standard (test); data mining; MedDRA; 3. Good health; Racine Pharma; [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing; Information extraction; Sample size determination; drug-related side effects and adverse reactions; [SDV.SP.PHARMA]Life Sciences [q-bio]/Pharmaceutical sciences/Pharmacology; [SDV.SPEE]Life Sciences [q-bio]/Santé publique et épidémiologie; computer
Source:	JMIR Research Protocols, JMIR Publications, 2019, 8 (5), pp.e11448. ⟨10.2196/11448⟩
ISSN:	1929-0748
DOI:	10.2196/11448
Description:	Background: Social media is a potential source of information for postmarketing drug safety surveillance that remains unexploited. Information technology solutions that aim to extract adverse drug reactions (ADRs) from posts on health forums require a rigorous evaluation methodology if their results are to be used to make decisions. First, a gold standard, consisting of manual annotations of ADRs by human experts on the corpus extracted from social media, must be built and its quality assessed. Second, as in clinical research protocols, the sample size must rest on statistical arguments. Finally, the extraction methods must target the relation between the drug and the disease (which might be either treated or caused by the drug) rather than simple co-occurrences in the posts. Objective: We propose a standardized protocol for evaluating software that extracts ADRs from messages on health forums. The study is conducted as part of the Adverse Drug Reactions from Patient Reports in Social Media project. Methods: Messages from French health forums were extracted. Entity recognition was based on the Racine Pharma lexicon for drugs and the Medical Dictionary for Regulatory Activities (MedDRA) terminology for potential adverse events (AEs). Natural language processing–based techniques automated the ADR information extraction (the relation between the drug and AE entities). The evaluation corpus was a random sample of the messages containing drug and/or AE concepts corresponding to recent pharmacovigilance alerts. Two persons experienced in medical terminology manually annotated the corpus according to an annotation guideline, thus creating the gold standard. We will evaluate our tool against the gold standard with recall, precision, and F-measure. Interannotator agreement, reflecting gold standard quality, will be evaluated with hierarchical kappa. Granularities in the terminologies will be further explored.
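The evaluation measures named in the Methods can be illustrated with a minimal sketch. It assumes that each extracted ADR is represented as a (message id, drug, adverse event) triple and that a prediction counts as correct only on an exact match with the gold standard. The record does not describe the matching rule, so the triple representation, the function name, and the toy data are all hypothetical.

```python
def precision_recall_f1(predicted, gold):
    """Score extracted relations against gold-standard annotations.

    Assumption: both arguments are collections of hashable items (here,
    hypothetical (message_id, drug, adverse_event) triples) and only
    exact matches count as true positives.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data, invented for illustration only.
gold = {
    ("m1", "drugA", "nausea"),
    ("m1", "drugA", "headache"),
    ("m2", "drugB", "rash"),
    ("m3", "drugC", "insomnia"),
}
predicted = {
    ("m1", "drugA", "nausea"),
    ("m2", "drugB", "rash"),
    ("m3", "drugC", "dizziness"),
}

precision, recall, f1 = precision_recall_f1(predicted, gold)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

With this toy data, two of three predictions are correct and two of four gold relations are found. Exact matching is the strictest choice; the granularity analysis mentioned in the Methods suggests that looser matching at higher terminology levels may also be of interest.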
Results: The necessary and sufficient sample size was calculated to ensure statistical confidence in the assessed results. As we expected a global recall of 0.5, we needed at least 384 identified ADR concepts to obtain a 95% CI with a total width of 0.10 around 0.5. The automated ADR information extraction in the evaluation corpus is finished, and the 2 annotators have completed the annotation process. The analysis of the performance of the ADR information extraction module against the gold standard is ongoing. Conclusions: This protocol relies on standardized statistical methods from clinical research to create the corpus, thus ensuring the necessary statistical power of the assessed results. Such an evaluation methodology is required to make the ADR information extraction software useful for postmarketing drug safety surveillance. International Registered Report Identifier (IRRID): RR1-10.2196/11448
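The sample size quoted in the Results is consistent with the usual normal-approximation (Wald) formula for a proportion, n = z² p(1 − p) / d², where d is the half-width of the confidence interval. The record does not say which formula was used, so the sketch below is an assumption, and the function name is hypothetical.

```python
from statistics import NormalDist


def sample_size_for_proportion(p, total_width, confidence=0.95):
    """Observations needed so that a Wald confidence interval around an
    expected proportion p has the requested total width.

    Assumption: normal approximation, n = z^2 * p * (1 - p) / d^2,
    with d equal to half of the total interval width.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    half_width = total_width / 2
    return z ** 2 * p * (1 - p) / half_width ** 2


# Expected recall of 0.5 and a 95% CI of total width 0.10, as in the Results.
n = sample_size_for_proportion(p=0.5, total_width=0.10)
print(f"required ADR concepts: about {n:.1f}")
```

Under these assumptions the result falls just above 384, in line with the figure reported in the Results. An expected proportion of 0.5 maximizes p(1 − p), so it is also the most conservative choice.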
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::2480c0081f5e67d8ace0dca8494212f7 http://europepmc.org/articles/PMC6528435