Showing 1 - 10 of 477 results for search: '"Tang, Xiangru"'
Author:
Zhang, Fengji, Zhang, Zexian, Keung, Jacky Wai, Tang, Xiangru, Yang, Zhen, Yu, Xiao, Hu, Wenhua
Code Smell Detection (CSD) plays a crucial role in improving software quality and maintainability, and Deep Learning (DL) techniques have emerged as a promising approach for CSD due to their superior performance. However, the effectiveness of DL-base…
External link:
http://arxiv.org/abs/2406.19240
Author:
Deng, Chunyuan, Zhao, Yilun, Heng, Yuzhao, Li, Yitong, Cao, Jiannan, Tang, Xiangru, Cohan, Arman
Data contamination has garnered increased attention in the era of large language models (LLMs) due to the reliance on extensive internet-derived training corpora. The issue of training corpus overlap with evaluation benchmarks--referred to as contami…
External link:
http://arxiv.org/abs/2406.14644
Author:
Tang, Xiangru, Zhang, Xingyao, Shao, Yanjun, Wu, Jie, Zhao, Yilun, Cohan, Arman, Gong, Ming, Zhang, Dongmei, Gerstein, Mark
Large language models (LLMs) excel at a variety of natural language processing tasks, yet they struggle to generate personalized content for individuals, particularly in real-world scenarios like scientific writing. Addressing this challenge, we intr…
External link:
http://arxiv.org/abs/2406.14275
Multimodal Large Language Models (MLLMs) have seen growing adoption across various scientific disciplines. These advancements encourage the investigation of molecule-text modeling within synthetic chemistry, a field dedicated to designing and conduct…
External link:
http://arxiv.org/abs/2406.13193
Author:
Biderman, Stella, Schoelkopf, Hailey, Sutawika, Lintang, Gao, Leo, Tow, Jonathan, Abbasi, Baber, Aji, Alham Fikri, Ammanamanchi, Pawan Sasanka, Black, Sidney, Clive, Jordan, DiPofi, Anthony, Etxaniz, Julen, Fattori, Benjamin, Forde, Jessica Zosa, Foster, Charles, Hsu, Jeffrey, Jaiswal, Mimansa, Lee, Wilson Y., Li, Haonan, Lovering, Charles, Muennighoff, Niklas, Pavlick, Ellie, Phang, Jason, Skowron, Aviya, Tan, Samson, Tang, Xiangru, Wang, Kevin A., Winata, Genta Indra, Yvon, François, Zou, Andy
Effective evaluation of language models remains an open challenge in NLP. Researchers and engineers face methodological issues such as the sensitivity of models to evaluation setup, difficulty of proper comparisons across methods, and the lack of rep…
External link:
http://arxiv.org/abs/2405.14782
Author:
Deng, Chunyuan, Tang, Xiangru, Zhao, Yilun, Wang, Hanming, Wang, Haoran, Zhou, Wangchunshu, Cohan, Arman, Gerstein, Mark
Recently, large language models (LLMs) have evolved into interactive agents, proficient in planning, tool use, and task execution across a wide variety of tasks. However, without specific agent tuning, open-source models like LLaMA currently struggle…
External link:
http://arxiv.org/abs/2404.04285
Author:
Lozhkov, Anton, Li, Raymond, Allal, Loubna Ben, Cassano, Federico, Lamy-Poirier, Joel, Tazi, Nouamane, Tang, Ao, Pykhtar, Dmytro, Liu, Jiawei, Wei, Yuxiang, Liu, Tianyang, Tian, Max, Kocetkov, Denis, Zucker, Arthur, Belkada, Younes, Wang, Zijian, Liu, Qian, Abulkhanov, Dmitry, Paul, Indraneil, Li, Zhuang, Li, Wen-Ding, Risdal, Megan, Li, Jia, Zhu, Jian, Zhuo, Terry Yue, Zheltonozhskii, Evgenii, Dade, Nii Osae Osae, Yu, Wenhao, Krauß, Lucas, Jain, Naman, Su, Yixuan, He, Xuanli, Dey, Manan, Abati, Edoardo, Chai, Yekun, Muennighoff, Niklas, Tang, Xiangru, Oblokulov, Muhtasham, Akiki, Christopher, Marone, Marc, Mou, Chenghao, Mishra, Mayank, Gu, Alex, Hui, Binyuan, Dao, Tri, Zebaze, Armel, Dehaene, Olivier, Patry, Nicolas, Xu, Canwen, McAuley, Julian, Hu, Han, Scholak, Torsten, Paquet, Sebastien, Robinson, Jennifer, Anderson, Carolyn Jane, Chapados, Nicolas, Patwary, Mostofa, Tajbakhsh, Nima, Jernite, Yacine, Ferrandis, Carlos Muñoz, Zhang, Lingming, Hughes, Sean, Wolf, Thomas, Guha, Arjun, von Werra, Leandro, de Vries, Harm
The BigCode project, an open-scientific collaboration focused on the responsible development of Large Language Models for Code (Code LLMs), introduces StarCoder2. In partnership with Software Heritage (SWH), we build The Stack v2 on top of the digita…
External link:
http://arxiv.org/abs/2402.19173
Author:
Hong, Sirui, Lin, Yizhang, Liu, Bang, Liu, Bangbang, Wu, Binhao, Li, Danyang, Chen, Jiaqi, Zhang, Jiayi, Wang, Jinlin, Zhang, Li, Zhang, Lingyao, Yang, Min, Zhuge, Mingchen, Guo, Taicheng, Zhou, Tuo, Tao, Wei, Wang, Wenyi, Tang, Xiangru, Lu, Xiangtao, Zheng, Xiawu, Liang, Xinbing, Fei, Yaying, Cheng, Yuheng, Xu, Zongze, Wu, Chenglin
Large Language Model (LLM)-based agents have demonstrated remarkable effectiveness. However, their performance can be compromised in data science scenarios that require real-time data adjustment, expertise in optimization due to complex dependencies…
External link:
http://arxiv.org/abs/2402.18679
Author:
Tang, Xiangru, Dai, Howard, Knight, Elizabeth, Wu, Fang, Li, Yunyang, Li, Tianxiao, Gerstein, Mark
Artificial intelligence (AI)-driven methods can vastly improve the historically costly drug design process, with various generative models already in widespread use. Generative models for de novo drug design, in particular, focus on the creation of n…
External link:
http://arxiv.org/abs/2402.08703
Author:
Fang, Yin, Liu, Kangwei, Zhang, Ningyu, Deng, Xinle, Yang, Penghui, Chen, Zhuo, Tang, Xiangru, Gerstein, Mark, Fan, Xiaohui, Chen, Huajun
As Large Language Models (LLMs) rapidly evolve, their influence in science is becoming increasingly prominent. The emerging capabilities of LLMs in task generalization and free-form dialogue can significantly advance fields like chemistry and biology…
External link:
http://arxiv.org/abs/2402.08303