Showing 1 - 10 of 34,766 results for search: '"Rangan, A"'
Author:
Savage, Thomas, Ma, Stephen, Boukil, Abdessalem, Patel, Vishwesh, Rangan, Ekanath, Rodriguez, Ivan, Chen, Jonathan H
Large Language Model (LLM) fine tuning is underutilized in the field of medicine. Two of the most common methods of fine tuning are Supervised Fine Tuning (SFT) and Direct Preference Optimization (DPO), but there is little guidance informing users wh
External link:
http://arxiv.org/abs/2409.12741
Author:
Chen, Qi, Geng, Xiubo, Rosset, Corby, Buractaon, Carolyn, Lu, Jingwen, Shen, Tao, Zhou, Kun, Xiong, Chenyan, Gong, Yeyun, Bennett, Paul, Craswell, Nick, Xie, Xing, Yang, Fan, Tower, Bryan, Rao, Nikhil, Dong, Anlei, Jiang, Wenqi, Liu, Zheng, Li, Mingqin, Liu, Chuanjie, Li, Zengzhong, Majumder, Rangan, Neville, Jennifer, Oakley, Andy, Risvik, Knut Magne, Simhadri, Harsha Vardhan, Varma, Manik, Wang, Yujing, Yang, Linjun, Yang, Mao, Zhang, Ce
Recent breakthroughs in large models have highlighted the critical significance of data scale, labels and modals. In this paper, we introduce MS MARCO Web Search, the first large-scale information-rich web dataset, featuring millions of real clicked
External link:
http://arxiv.org/abs/2405.07526
Author:
McGrouther, Caroline C., Rangan, Aaditya V., Di Florio, Arianna, Elman, Jeremy A., Schork, Nicholas J., Kelsoe, John
Bipolar Disorder (BD) is a complex disease. It is heterogeneous, both at the phenotypic and genetic level, although the extent and impact of this heterogeneity is not fully understood. In this paper, we leverage recent advances in heterogeneity analy
External link:
http://arxiv.org/abs/2405.00159
The upper mid-band (FR3) has been recently attracting interest for new generation of mobile networks, as it provides a promising balance between spectrum availability and coverage, which are inherent limitations of the sub 6GHz and millimeter wave ba
External link:
http://arxiv.org/abs/2404.17069
Author:
Suri, Siddharth, Counts, Scott, Wang, Leijie, Chen, Chacha, Wan, Mengting, Safavi, Tara, Neville, Jennifer, Shah, Chirag, White, Ryen W., Andersen, Reid, Buscher, Georg, Manivannan, Sathish, Rangan, Nagu, Yang, Longqi
Until recently, search engines were the predominant method for people to access online information. The recent emergence of large language models (LLMs) has given machines new capabilities such as the ability to generate new digital artifacts like te
External link:
http://arxiv.org/abs/2404.04268
Author:
Wan, Mengting, Safavi, Tara, Jauhar, Sujay Kumar, Kim, Yujin, Counts, Scott, Neville, Jennifer, Suri, Siddharth, Shah, Chirag, White, Ryen W, Yang, Longqi, Andersen, Reid, Buscher, Georg, Joshi, Dhruv, Rangan, Nagu
Transforming unstructured text into structured and meaningful forms, organized by useful category labels, is a fundamental step in text mining for downstream analysis and application. However, most existing methods for producing label taxonomies and
External link:
http://arxiv.org/abs/2403.12173
Author:
Rangan, Keshav, Yin, Yiqiao
This study presents an innovative enhancement to retrieval-augmented generation (RAG) systems by seamlessly integrating fine-tuned large language models (LLMs) with vector databases. This integration capitalizes on the combined strengths of structure
External link:
http://arxiv.org/abs/2402.17081
This technical report presents the training methodology and evaluation results of the open-source multilingual E5 text embedding models, released in mid-2023. Three embedding models of different sizes (small / base / large) are provided, offering a b
External link:
http://arxiv.org/abs/2402.05672
Author:
Giuliani, Amedeo, Nikbakht, Rasoul, Geraci, Giovanni, Kang, Seongjoon, Lozano, Angel, Rangan, Sundeep
This article proposes a generative neural network architecture for spatially consistent air-to-ground channel modeling. The approach considers the trajectories of uncrewed aerial vehicles along typical urban paths, capturing spatial dependencies with
External link:
http://arxiv.org/abs/2402.03517
In this paper, we introduce a novel and simple method for obtaining high-quality text embeddings using only synthetic data and less than 1k training steps. Unlike existing methods that often depend on multi-stage intermediate pre-training with billio
External link:
http://arxiv.org/abs/2401.00368