The Adverse Drug Reactions From Patient Reports in Social Media Project: Protocol for an Evaluation Against a Gold Standard

Authors: Rim Aboukhamis, Anita Burgun, Badisse Dahamna, Pierre Karapetiantz, Xiaoyi Chen, Nathalie Texier, Agnès Lillo-Le Louët, Carole Faviez, Yannick Girardeau, Sylvie Guillemin-Lanne, Myrtille Deldossi, Armelle Arnoux-Guenegou, Sandrine Katsahian
Contributors: Biomedical informatics and public health department, Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP), Centre de Recherche des Cordeliers (CRC (UMR_S_1138 / U1138)), École pratique des hautes études (EPHE), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Sorbonne Université (SU)-Université de Paris (UP), Kappa, Service d'informatique biomédicale [Rouen], CHU Rouen, Normandie Université (NU)-Normandie Université (NU)-Université de Rouen Normandie (UNIROUEN), Normandie Université (NU), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Université Paris Diderot - Paris 7 (UPD7)-Université Paris Descartes - Paris 5 (UPD5)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Sorbonne Université (SU), Temis, Hôpital Européen Georges Pompidou [APHP] (HEGP), Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Hôpitaux Universitaires Paris Ouest - Hôpitaux Universitaires Île de France Ouest (HUPO)
Language: English
Year of publication: 2019
Subject:
Medical terminology
020205 medical informatics
Computer science
social media
02 engineering and technology
Terminology
03 medical and health sciences
0302 clinical medicine
Pharmacovigilance
0202 electrical engineering, electronic engineering, information engineering
Protocol
Social media
030212 general & internal medicine
natural language processing
Protocol (science)
Information retrieval
General Medicine
Gold standard (test)
data mining
MedDRA
3. Good health
Racine Pharma
[INFO.INFO-TT]Computer Science [cs]/Document and Text Processing
Information extraction
Sample size determination
drug-related side effects and adverse reactions
[SDV.SP.PHARMA]Life Sciences [q-bio]/Pharmaceutical sciences/Pharmacology
[SDV.SPEE]Life Sciences [q-bio]/Santé publique et épidémiologie
Source: JMIR Research Protocols
JMIR Research Protocols, JMIR publications, 2019, 8 (5), pp.e11448. ⟨10.2196/11448⟩
ISSN: 1929-0748
DOI: 10.2196/11448
Description: Background: Social media is a potential source of information for postmarketing drug safety surveillance that remains unexploited. Information technology solutions that aim to extract adverse drug reactions (ADRs) from posts on health forums require a rigorous evaluation methodology if their results are to be used to make decisions. First, a gold standard, consisting of manual annotations of ADRs by human experts on the corpus extracted from social media, must be built and its quality must be assessed. Second, as in clinical research protocols, the sample size must rely on statistical arguments. Finally, the extraction methods must target the relation between the drug and the disease (which might be either treated or caused by the drug) rather than simple co-occurrences in the posts. Objective: We propose a standardized protocol for the evaluation of software that extracts ADRs from messages on health forums. The study is conducted as part of the Adverse Drug Reactions from Patient Reports in Social Media project. Methods: Messages from French health forums were extracted. Entity recognition was based on the Racine Pharma lexicon for drugs and the Medical Dictionary for Regulatory Activities (MedDRA) terminology for potential adverse events (AEs). Natural language processing–based techniques automated the ADR information extraction (relation between the drug and AE entities). The evaluation corpus was a random sample of the messages containing drug and/or AE concepts corresponding to recent pharmacovigilance alerts. Two persons experienced in medical terminology manually annotated the corpus according to an annotation guideline, thus creating the gold standard. We will evaluate our tool against the gold standard with recall, precision, and F-measure. Interannotator agreement, reflecting gold standard quality, will be evaluated with hierarchical kappa. Granularities in the terminologies will be further explored.
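The recall, precision, and F-measure evaluation named in the Methods can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each ADR is represented as a (message ID, drug, adverse event) tuple and that a prediction counts as correct only on an exact match with the gold standard; the record does not specify the matching rule, and the function name and sample data below are hypothetical.

```python
def precision_recall_f1(gold, predicted):
    """Compare predicted ADR relations against gold-standard annotations.

    Both arguments are iterables of hashable items; here, hypothetical
    (message_id, drug, adverse_event) tuples matched exactly.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)

    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Hypothetical toy data: 4 gold relations, 3 predictions, 2 of them correct.
gold = {
    (1, "drugA", "nausea"),
    (1, "drugA", "headache"),
    (2, "drugB", "rash"),
    (3, "drugC", "dizziness"),
}
predicted = {
    (1, "drugA", "nausea"),
    (2, "drugB", "rash"),
    (2, "drugB", "fever"),
}

p, r, f = precision_recall_f1(gold, predicted)
print(f"precision={p:.3f} recall={r:.3f} F1={f:.3f}")
```

With this toy data, 2 of 3 predictions are correct (precision 2/3) and 2 of 4 gold relations are found (recall 0.5). Exact matching is the strictest choice; a real evaluation over a hierarchical terminology might also credit partial matches at a coarser level, which this sketch does not attempt.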
Results: The necessary and sufficient sample size was calculated to ensure statistical confidence in the assessed results. As we expected a global recall of 0.5, we needed at least 384 identified ADR concepts to obtain a 95% CI with a total width of 0.10 around 0.5. The automated ADR information extraction in the evaluation corpus is complete, and the 2 annotators have completed the annotation process. The analysis of the performance of the ADR information extraction module against the gold standard is ongoing. Conclusions: This protocol applies standardized statistical methods from clinical research to create the corpus, thus ensuring the necessary statistical power of the assessed results. Such an evaluation methodology is required to make the ADR information extraction software useful for postmarketing drug safety surveillance. International Registered Report Identifier (IRRID): RR1-10.2196/11448
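The sample size quoted in the Results (384 ADR concepts for a 95% CI of total width 0.10 around an expected recall of 0.5) is consistent with the usual normal-approximation formula for a proportion, n = z² · p(1 − p) / d², where d is the half-width of the interval. The record does not state which formula the authors used, so the following Python sketch is an assumption, and the function name is hypothetical.

```python
from statistics import NormalDist


def sample_size_for_proportion(p, total_width, confidence=0.95):
    """Normal-approximation sample size for estimating a proportion.

    p           -- expected proportion (here, the expected recall)
    total_width -- full width of the desired confidence interval
    confidence  -- confidence level of the interval
    """
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = total_width / 2
    return z ** 2 * p * (1 - p) / half_width ** 2


n = sample_size_for_proportion(p=0.5, total_width=0.10)
print(f"required ADR concepts: {n:.1f}")
```

Under these assumptions the formula gives a value just above 384, which matches the figure in the Results when rounded to the nearest integer. Because p = 0.5 maximizes p(1 − p), this is also the most conservative sample size for the chosen interval width.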
Database: OpenAIRE