LITL at SMM4H: an old-school feature-based classifier for identifying adverse effects in Tweets

Authors: Tanguy, Ludovic, Ho-Dac, Lydia-Mai, Fabre, Cécile, Bois, Roxane, Haddad, Touati Mohamed Yacine, Ibarboure, Claire, Joyau, Marie, Le Moal, François, Moillic, Jade, Roudaut, Laura, Simounet, Mathilde, Stankovic, Irena, Vanderwaetere, Mickaela
Contributors: Cognition, Langues, Langage, Ergonomie (CLLE), Centre National de la Recherche Scientifique (CNRS)-École pratique des hautes études (EPHE), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Université Toulouse - Jean Jaurès (UT2J), Université Toulouse - Jean Jaurès (UT2J), Graciela Gonzalez-Hernandez, Ari Z. Klein, Davy Weissenbacher, Arjun Magge, Karen O'Connor, Abeed Sarker, Anne-Lyse Minard, Elena Tutubalina, Zulfat Miftahutdinov, Ilseyar Alimova, Ivan Flores, Ho-Dac, Lydia-Mai
Language: English
Publication year: 2020
Subject:
Source: Proceedings of the Fifth Social Media Mining for Health Applications Workshop & Shared Task
Graciela Gonzalez-Hernandez; Ari Z. Klein; Davy Weissenbacher; Arjun Magge; Karen O'Connor; Abeed Sarker; Anne-Lyse Minard; Elena Tutubalina; Zulfat Miftahutdinov; Ilseyar Alimova; Ivan Flores. Proceedings of the Fifth Social Media Mining for Health Applications Workshop & Shared Task, Association for Computational Linguistics, pp. 134-137, 2020
Description: This paper describes our participation in SMM4H shared task 2. We designed a linear classifier that estimates whether a tweet mentions an adverse effect associated with a medication. Our system addresses English and French and relies on a number of ad hoc word lists and features. These cues were mostly obtained through an extensive corpus analysis of the provided training data. We tested different weighting schemes (manually tuned or based on logistic regression); the best one achieved an F1 score of 0.31 for English and 0.15 for French.
Database: OpenAIRE
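
Illustrative note: the description mentions a linear classifier built on word lists with manually tuned weights, evaluated by F1 score. The following is only a minimal sketch of that general idea, not the authors' system. The cue lists, weights, bias, and function names (`CUE_LISTS`, `WEIGHTS`, `BIAS`, `extract_features`, `predict`, `f1_score`) are all hypothetical and chosen for illustration.

```python
import re

# Hypothetical cue lists, for illustration only (not the authors' lists).
CUE_LISTS = {
    "drug": {"ibuprofen", "prozac", "humira"},
    "symptom": {"headache", "nausea", "dizzy", "rash"},
    "causal": {"makes", "made", "gave", "causing"},
}

# Hypothetical manually chosen weights and bias (assumed values).
WEIGHTS = {"drug": 1.0, "symptom": 1.5, "causal": 1.0}
BIAS = -3.0


def extract_features(tweet):
    """Count how many tokens of the tweet fall in each cue list."""
    tokens = re.findall(r"[a-z]+", tweet.lower())
    return {
        name: sum(tok in words for tok in tokens)
        for name, words in CUE_LISTS.items()
    }


def predict(tweet):
    """Linear decision rule: positive if the weighted sum plus bias is above 0."""
    feats = extract_features(tweet)
    score = BIAS + sum(WEIGHTS[name] * value for name, value in feats.items())
    return score > 0


def f1_score(gold, pred):
    """F1 of the positive class from parallel lists of booleans."""
    tp = sum(1 for g, p in zip(gold, pred) if g and p)
    fp = sum(1 for g, p in zip(gold, pred) if not g and p)
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In this sketch, a tweet is flagged only when enough cues co-occur (for example a drug name, a symptom, and a causal verb). A logistic regression, the other weighting scheme the description names, would learn the weights and bias from the training data instead of fixing them by hand.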