Author: |
Yuhan Li, Yang Zhou, Xiaofei Hu, Qingxiang Li, Jiali Tian |
Language: |
English |
Publication year: |
2024 |
Subject: |
|
Source: |
Scientific Reports, Vol 14, Iss 1, Pp 1-12 (2024) |
Document type: |
article |
ISSN: |
2045-2322 |
DOI: |
10.1038/s41598-024-78948-5 |
Description: |
Abstract: Social media data are noisy and non-standardized, which makes it difficult for existing methods to recognize named entities because entities are sparse and the semantic context is limited. To address these issues, this study proposes SEMFF-NER, a named entity recognition (NER) method for social media texts that integrates multi-scale features and syntactic information. First, global features are extracted using a Transformer-based encoder (XLNet) with embedded dependency syntactic relations to enhance the semantic representation. Next, sliding windows of different lengths capture local features, which are fed into a bidirectional long short-term memory (BiLSTM) network to capture multi-level local features. A fusion-attention mechanism then integrates the global contextual information with the multiple local features to predict the optimal entity labels. Extensive experiments on three datasets collected from English social media platforms (WNUT2016, WNUT2017, OntoNotes5.0_English) show that the proposed method performs favorably, and ablation experiments further confirm its viability and effectiveness. |
Database: |
Directory of Open Access Journals |
External link: |
|
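The abstract describes two steps that are easy to illustrate in isolation: taking sliding windows of different lengths over a token sequence, and fusing one global feature vector with several local ones through attention weights. The following is a minimal sketch under stated assumptions, not the authors' implementation. The function names (`sliding_windows`, `fuse`), the stride of 1, and the scaled dot-product scoring are all hypothetical choices made for illustration. The XLNet encoder and the BiLSTM are omitted, and the feature vectors are toy values.

```python
import math


def sliding_windows(tokens, size):
    """Return every contiguous window of `size` tokens (stride 1, an assumption)."""
    if size <= 0:
        raise ValueError("window size must be positive")
    return [tokens[i:i + size] for i in range(len(tokens) - size + 1)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(global_vec, local_vecs):
    """Attention-style fusion (hypothetical form, not taken from the paper).

    The global vector acts as the query. It is scored against itself and
    against each local vector with a scaled dot product, and the fused
    vector is the softmax-weighted sum of all candidates.
    """
    candidates = [global_vec] + list(local_vecs)
    scale = math.sqrt(len(global_vec))
    scores = [sum(g * c for g, c in zip(global_vec, cand)) / scale
              for cand in candidates]
    weights = softmax(scores)
    fused = [sum(w * cand[d] for w, cand in zip(weights, candidates))
             for d in range(len(global_vec))]
    return fused, weights


# Toy usage: windows of two different lengths over one short post.
tokens = ["just", "landed", "in", "new", "york", "city"]
windows_by_size = {k: sliding_windows(tokens, k) for k in (2, 3)}

# Toy 2-dimensional features standing in for encoder and BiLSTM outputs.
fused, weights = fuse([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

In this sketch a local vector that points in the same direction as the global vector receives a larger weight than one that is orthogonal to it, which is the general behaviour expected of an attention-based fusion step.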