SS-BERT: A Semantic Information Selecting Approach for Open-Domain Question Answering

Authors:	Xuan Fu, Jiangnan Du, Hai-Tao Zheng, Jianfeng Li, Cuiqin Hou, Qiyu Zhou, Hong-Gee Kim
Publication year:	2023
Subject:	open-domain question answering; passage rerank; data augmentation; negative selection; BERT; Computer Networks and Communications; Hardware and Architecture; Control and Systems Engineering; Signal Processing; Electrical and Electronic Engineering
Source:	Electronics; Volume 12; Issue 7; Pages: 1692
ISSN:	2079-9292
DOI:	10.3390/electronics12071692
Description:	Open-Domain Question Answering (Open-Domain QA) aims to answer any factoid question posed by a user. Recent progress in Open-Domain QA adopts the “retriever-reader” structure, which has proven effective. Retrievers are mainly categorized as sparse or dense. In recent work, dense retrievers have shown stronger semantic interpretation than sparse retrievers. Training a dual-encoder dense retriever for document retrieval and reranking poses two challenges: negative selection and a lack of training data. In this study, we make three major contributions to this topic: negative selection by query generation, data augmentation from negatives, and a passage evaluation method. We show that the model performs better when it focuses on false negatives and data augmentation in the Open-Domain QA passage rerank task. Our model outperforms other single dual-encoder rerankers over BERT-base and BM25 by 0.7 in MRR@10, and achieves the highest Recall@50 and the maximum Recall@1000, the latter being bounded by the BM25 retrieval results.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::9d3ea5aedc4c3893f5b07ac78a6b95cb https://doi.org/10.3390/electronics12071692