Showing 1 - 10 of 123 for search: '"Najork, Marc"'
Author:
Chaudhary, Aditi, Raman, Karthik, Srinivasan, Krishna, Hashimoto, Kazuma, Bendersky, Mike, Najork, Marc
Query-document relevance prediction is a critical problem in Information Retrieval systems. This problem has increasingly been tackled using (pretrained) transformer-based models which are finetuned using large collections of labeled data. However, i…
External link:
http://arxiv.org/abs/2305.11944
Author:
Zhang, Rongzhi, Shen, Jiaming, Liu, Tianqi, Liu, Jialu, Bendersky, Michael, Najork, Marc, Zhang, Chao
Knowledge distillation is a popular technique to transfer knowledge from large teacher models to a small student model. Typically, the student learns to imitate the teacher by minimizing the KL divergence of its output distribution with the teacher's…
External link:
http://arxiv.org/abs/2305.05010
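The entry above describes the standard knowledge-distillation objective: the student imitates the teacher by minimizing the KL divergence between its output distribution and the teacher's. A minimal sketch of that loss in plain Python (the function names and the temperature value are illustrative, not taken from the paper):

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over a list of logits."""
    z = [l / temperature for l in logits]
    m = max(z)  # subtract max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def distillation_kl(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened output
    distributions; the student is trained to drive this toward zero."""
    p = softmax(teacher_logits, temperature)  # teacher distribution
    q = softmax(student_logits, temperature)  # student distribution
    return sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q))
```

The loss is zero when the student exactly reproduces the teacher's distribution and grows as the two distributions diverge; a temperature above 1 softens both distributions so the student also learns from the teacher's relative probabilities on non-top classes.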
Automatic headline generation enables users to comprehend ongoing news events promptly and has recently become an important task in web mining and natural language processing. With the growing need for news headline generation, we argue that the hall…
External link:
http://arxiv.org/abs/2302.05852
Author:
Zhang, Yunan, Yan, Le, Qin, Zhen, Zhuang, Honglei, Shen, Jiaming, Wang, Xuanhui, Bendersky, Michael, Najork, Marc
Unbiased learning to rank (ULTR) studies the problem of mitigating various biases from implicit user feedback data such as clicks, and has been receiving considerable attention recently. A popular ULTR approach for real-world applications uses a two-…
External link:
http://arxiv.org/abs/2212.13937
Author:
Mehta, Sanket Vaibhav, Gupta, Jai, Tay, Yi, Dehghani, Mostafa, Tran, Vinh Q., Rao, Jinfeng, Najork, Marc, Strubell, Emma, Metzler, Donald
Differentiable Search Indices (DSIs) encode a corpus of documents in model parameters and use the same model to answer user queries directly. Despite the strong performance of DSI models, deploying them in situations where the corpus changes over tim…
External link:
http://arxiv.org/abs/2212.09744
Author:
Bai, Aijun, Jagerman, Rolf, Qin, Zhen, Yan, Le, Kar, Pratyush, Lin, Bing-Rong, Wang, Xuanhui, Bendersky, Michael, Najork, Marc
As Learning-to-Rank (LTR) approaches primarily seek to improve ranking quality, their output scores are not scale-calibrated by design. This fundamentally limits LTR usage in score-sensitive applications. Though a simple multi-objective approach that…
External link:
http://arxiv.org/abs/2211.01494
Pre-trained language model (e.g., BERT) based deep retrieval models achieved superior performance over lexical retrieval models (e.g., BM25) in many passage retrieval tasks. However, limited work has been done to generalize a deep retrieval model to…
External link:
http://arxiv.org/abs/2201.10582
Automating information extraction from form-like documents at scale is a pressing need due to its potential impact on automating business workflows across many industries like financial services, insurance, and healthcare. The key challenge is that f…
External link:
http://arxiv.org/abs/2201.02647
Author:
Wang, Nan, Qin, Zhen, Yan, Le, Zhuang, Honglei, Wang, Xuanhui, Bendersky, Michael, Najork, Marc
Multiclass classification (MCC) is a fundamental machine learning problem of classifying each instance into one of a predefined set of classes. In the deep learning era, extensive efforts have been spent on developing more powerful neural embedding m…
External link:
http://arxiv.org/abs/2112.09727
Author:
Qin, Zhen, Yan, Le, Tay, Yi, Zhuang, Honglei, Wang, Xuanhui, Bendersky, Michael, Najork, Marc
We explore a novel perspective of knowledge distillation (KD) for learning to rank (LTR), and introduce Self-Distilled neural Rankers (SDR), where student rankers are parameterized identically to their teachers. Unlike the existing ranking distillati…
External link:
http://arxiv.org/abs/2109.15285