Showing 1 - 2 of 2 for search: '"Paul, Debalina Ghosh"'
With the rapid development of Large Language Models (LLMs), a large number of machine learning models have been developed to assist with programming tasks, including the generation of program code from natural language input. However, how to evaluate such …
External link: http://arxiv.org/abs/2406.12655
In the scenario-based evaluation of machine learning models, a key problem is how to construct test datasets that represent various scenarios. The methodology proposed in this paper is to construct a benchmark and attach metadata to each test case. …
External link: http://arxiv.org/abs/2406.12635