Showing 1 - 10 of 133 for search: '"Wang, Sida"'
Author:
Lei, Fangyu, Chen, Jixuan, Ye, Yuxiao, Cao, Ruisheng, Shin, Dongchan, Su, Hongjin, Suo, Zhaoqing, Gao, Hongcheng, Hu, Wenjing, Yin, Pengcheng, Zhong, Victor, Xiong, Caiming, Sun, Ruoxi, Liu, Qian, Wang, Sida, Yu, Tao
Real-world enterprise text-to-SQL workflows often involve complex cloud or local data across various database systems, multiple SQL queries in various dialects, and diverse operations from data transformation to analytics. We introduce Spider 2.0, an
External link:
http://arxiv.org/abs/2411.07763
Author:
Yang, John, Jimenez, Carlos E., Zhang, Alex L., Lieret, Kilian, Yang, Joyce, Wu, Xindi, Press, Ori, Muennighoff, Niklas, Synnaeve, Gabriel, Narasimhan, Karthik R., Yang, Diyi, Wang, Sida I., Press, Ofir
Autonomous systems for software engineering are now capable of fixing bugs and developing features. These systems are commonly evaluated on SWE-bench (Jimenez et al., 2024a), which assesses their ability to solve software issues from GitHub repositor
External link:
http://arxiv.org/abs/2410.03859
Author:
Cao, Ruisheng, Lei, Fangyu, Wu, Haoyuan, Chen, Jixuan, Fu, Yeqiao, Gao, Hongcheng, Xiong, Xinzhuang, Zhang, Hanchong, Mao, Yuchen, Hu, Wenjing, Xie, Tianbao, Xu, Hongshen, Zhang, Danyang, Wang, Sida, Sun, Ruoxi, Yin, Pengcheng, Xiong, Caiming, Ni, Ansong, Liu, Qian, Zhong, Victor, Chen, Lu, Yu, Kai, Yu, Tao
Data science and engineering workflows often span multiple stages, from warehousing to orchestration, using tools like BigQuery, dbt, and Airbyte. As vision language models (VLMs) advance in multimodal understanding and code generation, VLM-based age
External link:
http://arxiv.org/abs/2407.10956
Self-assembly of matter in solution generally relies on attractive interactions that overcome entropy and drive the formation of higher-order molecular and particulate structures. Such interactions play key roles in a variety of contexts, e.g., cryst
External link:
http://arxiv.org/abs/2405.12099
Author:
Jain, Naman, Han, King, Gu, Alex, Li, Wen-Ding, Yan, Fanjia, Zhang, Tianjun, Wang, Sida, Solar-Lezama, Armando, Sen, Koushik, Stoica, Ion
Large Language Models (LLMs) applied to code-related applications have emerged as a prominent field, attracting significant interest from both academia and industry. However, as new and improved LLMs are developed, existing evaluation benchmarks (e.g
External link:
http://arxiv.org/abs/2403.07974
We introduce Syntax-Aware Fill-In-the-Middle (SAFIM), a new benchmark for evaluating Large Language Models (LLMs) on the code Fill-in-the-Middle (FIM) task. This benchmark focuses on syntax-aware completions of program structures such as code blocks
External link:
http://arxiv.org/abs/2403.04814
Author:
Gu, Alex, Rozière, Baptiste, Leather, Hugh, Solar-Lezama, Armando, Synnaeve, Gabriel, Wang, Sida I.
We present CRUXEval (Code Reasoning, Understanding, and eXecution Evaluation), a benchmark consisting of 800 Python functions (3-13 lines). Each function comes with an input-output pair, leading to two natural tasks: input prediction and output predi
External link:
http://arxiv.org/abs/2401.03065
Author:
Wang, Sida I.
The striking ability of unsupervised word translation has been demonstrated with the help of word vectors / pretraining; however, they require large amounts of data and usually fail if the data come from different domains. We propose coocmap, a meth
External link:
http://arxiv.org/abs/2305.14200
Interactive semantic parsing based on natural language (NL) feedback, where users provide feedback to correct the parser's mistakes, has emerged as a more practical scenario than the traditional one-shot semantic parsing. However, prior work has heavil
External link:
http://arxiv.org/abs/2305.08195
Author:
Ni, Ansong, Iyer, Srini, Radev, Dragomir, Stoyanov, Ves, Yih, Wen-tau, Wang, Sida I., Lin, Xi Victoria
The advent of large language models trained on code (code LLMs) has led to significant progress in language-to-code generation. State-of-the-art approaches in this area combine LLM decoding with sample pruning and reranking using test cases or heuris
External link:
http://arxiv.org/abs/2302.08468