SS-BERT: A Semantic Information Selecting Approach for Open-Domain Question Answering

Autor: Xuan Fu, Jiangnan Du, Hai-Tao Zheng, Jianfeng Li, Cuiqin Hou, Qiyu Zhou, Hong-Gee Kim
Rok vydání: 2023
Předmět:
Zdroj: Electronics; Volume 12; Issue 7; Pages: 1692
ISSN: 2079-9292
DOI: 10.3390/electronics12071692
Popis: Open-Domain Question Answering (Open-Domain QA) aims to answer any factoid questions from users. Recent progress in Open-Domain QA adopts the “retriever-reader” structure, which has proven effective. Retriever methods are mainly categorized as sparse retrievers and dense retrievers. In recent work, the dense retriever showed a stronger semantic interpretation than the sparse retriever. When training a dual-encoder dense retriever for document retrieval and reranking, there are two challenges: negative selection and a lack of training data. In this study, we make three major contributions to this topic: negative selection by query generation, data augmentation from negatives, and a passage evaluation method. We prove that the model performs better by focusing on false negatives and data augmentation in the Open-Domain QA passage rerank task. Our model outperforms other single dual-encoder rerankers over BERT-base and BM25 by 0.7 in MRR@10, achieving the highest Recall@50 and the max Recall@1000, which is restricted by the BM25 retrieval results.
Databáze: OpenAIRE