A pattern-aware self-attention network for distant supervised relation extraction
Author: | Heyan Huang, Yuming Shang, Xian-Ling Mao, Xin Sun, Wei Wei |
Year of publication: | 2022 |
Subject: | Information Systems and Management, Relation (database), Computer science, Machine learning, Relationship extraction, Computer Science Applications, Theoretical Computer Science, Constraint (information theory), Artificial Intelligence, Control and Systems Engineering, Probability distribution, Language model, Layer (object-oriented design), Software, Sentence, Transformer (machine learning model) |
Source: | Information Sciences. 584:269-279 |
ISSN: | 0020-0255 |
DOI: | 10.1016/j.ins.2021.10.047 |
Description: | Distant supervised relation extraction is an efficient strategy for finding relational facts in unstructured text without labeled training data. A recent paradigm for developing relation extractors uses pre-trained Transformer language models to produce high-quality sentence representations. However, because the original Transformer is weak at capturing local dependencies and phrasal structures, existing Transformer-based methods cannot identify various relational patterns in sentences. To address this issue, we propose a novel distant supervised relation extraction model, which employs a specifically designed pattern-aware self-attention network to automatically discover relational patterns for pre-trained Transformers in an end-to-end manner. Specifically, the proposed method assumes that the correlation between two adjacent tokens reflects the probability that they belong to the same pattern. Based on this assumption, a novel self-attention network is designed to generate the probability distribution of all patterns in a sentence. Then, the probability distribution is applied as a constraint in the first Transformer layer to encourage its attention heads to follow the relational pattern structures. As a result, fine-grained pattern information is enhanced in the pre-trained Transformer without losing global dependencies. Extensive experimental results on two popular benchmark datasets demonstrate that our model performs better than the state-of-the-art baselines. (A hedged code sketch of this attention constraint follows the record below.) |
Database: | OpenAIRE |
External link: |
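
The description above outlines the core mechanism: correlations between adjacent tokens are turned into a pattern probability distribution, which is then applied as a constraint on the attention heads of the first Transformer layer. The following is a minimal, hedged sketch of one way such a constraint could be wired into a self-attention layer in PyTorch. It is not the authors' implementation; the class and parameter names (PatternAwareSelfAttention, pattern_proj) and the log-space span score are illustrative assumptions.

```python
# Illustrative sketch only (not the paper's released code): a self-attention
# layer whose logits are biased toward spans of adjacent tokens that a small
# scorer judges likely to belong to the same relational pattern.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PatternAwareSelfAttention(nn.Module):
    def __init__(self, hidden_size: int, num_heads: int):
        super().__init__()
        assert hidden_size % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = hidden_size // num_heads
        self.qkv = nn.Linear(hidden_size, 3 * hidden_size)
        self.out = nn.Linear(hidden_size, hidden_size)
        # Scores the correlation of each adjacent token pair; a high score is
        # read as "both tokens belong to the same pattern" (assumption).
        self.pattern_proj = nn.Linear(2 * hidden_size, 1)

    def pattern_bias(self, x: torch.Tensor) -> torch.Tensor:
        """x: (batch, seq_len, hidden) -> additive bias (batch, seq_len, seq_len)."""
        pairs = torch.cat([x[:, :-1], x[:, 1:]], dim=-1)           # (b, n-1, 2h)
        adj = torch.sigmoid(self.pattern_proj(pairs)).squeeze(-1)  # (b, n-1) in (0, 1)
        # Probability that tokens i and j sit in one contiguous pattern is taken
        # as the product of all adjacent correlations between them; a log-space
        # cumulative sum makes this O(n^2) and numerically stable.
        log_adj = torch.log(adj + 1e-9)
        cum = F.pad(torch.cumsum(log_adj, dim=-1), (1, 0))         # (b, n)
        span = cum.unsqueeze(1) - cum.unsqueeze(2)                 # log prod over span
        return -span.abs()                                         # symmetric, <= 0

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, n, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q = q.view(b, n, self.num_heads, self.head_dim).transpose(1, 2)
        k = k.view(b, n, self.num_heads, self.head_dim).transpose(1, 2)
        v = v.view(b, n, self.num_heads, self.head_dim).transpose(1, 2)
        scores = q @ k.transpose(-2, -1) / self.head_dim ** 0.5    # (b, heads, n, n)
        # Constrain every head of this (first) layer with the pattern bias, so
        # attention concentrates inside likely relational patterns while the
        # softmax still allows global dependencies.
        scores = scores + self.pattern_bias(x).unsqueeze(1)
        attn = scores.softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, n, -1)
        return self.out(out)


if __name__ == "__main__":
    layer = PatternAwareSelfAttention(hidden_size=768, num_heads=12)
    tokens = torch.randn(2, 16, 768)      # a toy batch of two 16-token sentences
    print(layer(tokens).shape)            # torch.Size([2, 16, 768])
```

In this sketch the span bias is added to every head's attention logits before the softmax, so attention is nudged to stay within likely patterns without masking anything out, which matches the abstract's claim that fine-grained pattern information is enhanced while global dependencies are preserved.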