A method for named entity recognition in social media texts with syntactically enhanced multiscale feature fusion

Authors: Yuhan Li, Yang Zhou, Xiaofei Hu, Qingxiang Li, Jiali Tian
Language: English
Year of publication: 2024
Subject:
Source: Scientific Reports, Vol 14, Iss 1, Pp 1-12 (2024)
Document type: article
ISSN: 2045-2322
DOI: 10.1038/s41598-024-78948-5
Description: Social media data are noisy and non-standardized, which makes named entity recognition difficult for existing methods because entities are sparse and the semantic context is limited. To address these issues, this study proposes SEMFF-NER, a named entity recognition (NER) method for social media texts that integrates multi-scale features and syntactic information. First, global features are extracted using a Transformer-based encoder (XLNET) with embedded dependency syntactic relations to enhance the semantic representation. Next, sliding windows of different lengths extract local features, which are fed into a bidirectional long short-term memory (BiLSTM) network to obtain multi-level local features. A fusion-attention mechanism then integrates the global contextual information with the multiple local features to predict the optimal entity labels. Extensive experiments on three datasets collected from English social media platforms (WNUT2016, WNUT2017, OntoNotes5.0_English) demonstrate the advantageous performance of the proposed method, and ablation experiments further confirm its viability and effectiveness.
Database: Directory of Open Access Journals
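
The multi-scale step in the description (sliding windows of different lengths, followed by attention-based fusion with a global representation) can be illustrated with a minimal sketch. This is not the authors' implementation: the XLNET encoder and the BiLSTM are omitted, the vectors are toy lists of floats, and the names `sliding_windows`, `softmax`, and `fuse` are hypothetical helpers introduced here only for illustration. The fusion shown is a plain dot-product attention, which is an assumption about how such a mechanism might look.

```python
import math


def sliding_windows(tokens, size):
    """Return one window of `size` tokens per input token, padded at the edges."""
    half = size // 2
    padded = ["<pad>"] * half + list(tokens) + ["<pad>"] * (size - 1 - half)
    return [padded[i:i + size] for i in range(len(tokens))]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(global_vec, local_vecs):
    """Weight each local vector by its dot product with the global vector.

    Assumption: the global vector acts as the attention query. The paper's
    fusion-attention mechanism may differ in its details.
    """
    scores = [sum(g * x for g, x in zip(global_vec, vec)) for vec in local_vecs]
    weights = softmax(scores)
    fused = [
        sum(w * vec[d] for w, vec in zip(weights, local_vecs))
        for d in range(len(global_vec))
    ]
    return weights, fused


if __name__ == "__main__":
    tokens = ["just", "landed", "in", "new", "york"]
    # One set of windows per scale; the window sizes here are illustrative.
    for size in (1, 3, 5):
        print(size, sliding_windows(tokens, size))

    # Toy vectors standing in for one token's global and per-scale features.
    weights, fused = fuse([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
    print(weights, fused)
```

In a full model, each window would be encoded (for example by a BiLSTM) into a vector before fusion; here the per-scale vectors are supplied by hand to keep the sketch dependency-free.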