QuadrupletBERT: An Efficient Model For Embedding-Based Large-Scale Retrieval
Author: | Xi Wang, Peiyang Liu, Wei Ye, Shikun Zhang, Sen Wang |
Year of publication: | 2021 |
Subject: | Information retrieval; ranking; embedding; language models; machine learning; artificial intelligence; computer science |
Source: | NAACL-HLT |
DOI: | 10.18653/v1/2021.naacl-main.292 |
Description: | The embedding-based large-scale query-document retrieval problem is a hot topic in the information retrieval (IR) field. Considering that pre-trained language models like BERT have achieved great success in a wide variety of NLP tasks, we present a QuadrupletBERT model for effective and efficient retrieval in this paper. Unlike most existing BERT-style retrieval models, which focus only on the ranking phase of retrieval systems, our model makes considerable improvements to the retrieval phase and leverages the distances between simple negative and hard negative instances to obtain better embeddings. Experimental results demonstrate that our QuadrupletBERT achieves state-of-the-art results in embedding-based large-scale retrieval tasks. |
Database: | OpenAIRE |
External link: |
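The description says the model leverages the distances between simple negative and hard negative instances to obtain better embeddings. The paper's exact objective is not reproduced in this record, but the general shape of a quadruplet margin loss over (anchor, positive, hard negative, simple negative) tuples can be sketched as follows; the function name, the two margins, and the use of squared Euclidean distance are illustrative assumptions, not the authors' definitions:

```python
import numpy as np

def quadruplet_loss(anchor, positive, hard_neg, simple_neg,
                    margin1=1.0, margin2=0.5):
    """Sketch of a hinge-style quadruplet loss (illustrative only).

    Two constraints are enforced:
      1. the positive is closer to the anchor than the hard negative
         is, by at least margin1;
      2. the hard negative is closer to the anchor than the simple
         negative is, by at least margin2 (this is the extra term
         that distinguishes a quadruplet loss from a triplet loss).
    """
    def d(a, b):
        # Squared Euclidean distance between two embedding vectors.
        return float(np.sum((a - b) ** 2))

    term1 = max(0.0, d(anchor, positive) - d(anchor, hard_neg) + margin1)
    term2 = max(0.0, d(anchor, hard_neg) - d(anchor, simple_neg) + margin2)
    return term1 + term2
```

When both margin constraints are already satisfied the loss is zero, so gradients act only on tuples that still violate the desired ordering of positive, hard negative, and simple negative around the anchor.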