Zobrazeno 1 - 10
of 846
pro vyhledávání: '"Zhang, Hanlin"'
Autor:
Wang, Ziqi, Zhang, Hanlin, Li, Xiner, Huang, Kuan-Hao, Han, Chi, Ji, Shuiwang, Kakade, Sham M., Peng, Hao, Ji, Heng
Position bias has proven to be a prevalent issue of modern language models (LMs), where the models prioritize content based on its position within the given context. This bias often leads to unexpected model failures and hurts performance, robustness
Externí odkaz:
http://arxiv.org/abs/2407.01100
Autor:
Li, Jeffrey, Fang, Alex, Smyrnis, Georgios, Ivgi, Maor, Jordan, Matt, Gadre, Samir, Bansal, Hritik, Guha, Etash, Keh, Sedrick, Arora, Kushal, Garg, Saurabh, Xin, Rui, Muennighoff, Niklas, Heckel, Reinhard, Mercat, Jean, Chen, Mayee, Gururangan, Suchin, Wortsman, Mitchell, Albalak, Alon, Bitton, Yonatan, Nezhurina, Marianna, Abbas, Amro, Hsieh, Cheng-Yu, Ghosh, Dhruba, Gardner, Josh, Kilian, Maciej, Zhang, Hanlin, Shao, Rulin, Pratt, Sarah, Sanyal, Sunny, Ilharco, Gabriel, Daras, Giannis, Marathe, Kalyani, Gokaslan, Aaron, Zhang, Jieyu, Chandu, Khyathi, Nguyen, Thao, Vasiljevic, Igor, Kakade, Sham, Song, Shuran, Sanghavi, Sujay, Faghri, Fartash, Oh, Sewoong, Zettlemoyer, Luke, Lo, Kyle, El-Nouby, Alaaeldin, Pouransari, Hadi, Toshev, Alexander, Wang, Stephanie, Groeneveld, Dirk, Soldaini, Luca, Koh, Pang Wei, Jitsev, Jenia, Kollar, Thomas, Dimakis, Alexandros G., Carmon, Yair, Dave, Achal, Schmidt, Ludwig, Shankar, Vaishaal
We introduce DataComp for Language Models (DCLM), a testbed for controlled dataset experiments with the goal of improving language models. As part of DCLM, we provide a standardized corpus of 240T tokens extracted from Common Crawl, effective pretrai
Externí odkaz:
http://arxiv.org/abs/2406.11794
Autor:
Brandfonbrener, David, Zhang, Hanlin, Kirsch, Andreas, Schwarz, Jonathan Richard, Kakade, Sham
Selecting high-quality data for pre-training is crucial in shaping the downstream task performance of language models. A major challenge lies in identifying this optimal subset, a problem generally considered intractable, thus necessitating scalable
Externí odkaz:
http://arxiv.org/abs/2406.10670
Retrieval-Augmented Generation (RAG) improves pre-trained models by incorporating external knowledge at test time to enable customized adaptation. We study the risk of datastore leakage in Retrieval-In-Context RAG Language Models (LMs). We show that
Externí odkaz:
http://arxiv.org/abs/2402.17840
Autor:
Zhang, Hanlin, Zhang, Yi-Fan, Yu, Yaodong, Madeka, Dhruv, Foster, Dean, Xing, Eric, Lakkaraju, Himabindu, Kakade, Sham
Accurate uncertainty quantification is crucial for the safe deployment of machine learning models, and prior research has demonstrated improvements in the calibration of modern language models (LMs). We study in-context learning (ICL), a prevalent me
Externí odkaz:
http://arxiv.org/abs/2312.04021
Autor:
Zhang, Hanlin, Edelman, Benjamin L., Francati, Danilo, Venturi, Daniele, Ateniese, Giuseppe, Barak, Boaz
Watermarking generative models consists of planting a statistical signal (watermark) in a model's output so that it can be later verified that the output was generated by the given model. A strong watermarking scheme satisfies the property that a com
Externí odkaz:
http://arxiv.org/abs/2311.04378
Autor:
Zhang, Hanlin, Cheng, Wenzheng
Gene essentiality refers to the degree to which a gene is necessary for the survival and reproductive efficacy of a living organism. Although the essentiality of non-coding genes has been documented, there are still aspects of non-coding genes' essen
Externí odkaz:
http://arxiv.org/abs/2309.10008
Pre-trained large language models (LMs) struggle to perform logical reasoning reliably despite advances in scale and compositionality. In this work, we tackle this challenge through the lens of symbolic programming. We propose DSR-LM, a Differentiabl
Externí odkaz:
http://arxiv.org/abs/2305.03742
Autor:
Pan, Alexander, Chan, Jun Shern, Zou, Andy, Li, Nathaniel, Basart, Steven, Woodside, Thomas, Ng, Jonathan, Zhang, Hanlin, Emmons, Scott, Hendrycks, Dan
Artificial agents have traditionally been trained to maximize reward, which may incentivize power-seeking and deception, analogous to how next-token prediction in language models (LMs) may incentivize toxicity. So do agents naturally learn to be Mach
Externí odkaz:
http://arxiv.org/abs/2304.03279
In the transfer-based adversarial attacks, adversarial examples are only generated by the surrogate models and achieve effective perturbation in the victim models. Although considerable efforts have been developed on improving the transferability of
Externí odkaz:
http://arxiv.org/abs/2303.15109