Showing 1 - 10 of 4,162 results for search: '"ZHAO, HONGYU"'
Evaluating large language models (LLMs) is costly: it requires the generation and examination of LLM outputs on a large-scale benchmark of various tasks. This paper investigates how to efficiently reduce the tasks used to benchmark LLMs without affec
External link:
http://arxiv.org/abs/2410.13804
Testing for differences in features between clusters in various applications often leads to inflated false positives when practitioners use the same dataset to identify clusters and then test features, an issue commonly known as "double dipping". T
External link:
http://arxiv.org/abs/2410.06451
Published in:
EMNLP 2024
The applications of large language models (LLMs) are promising for biomedical and healthcare research. Despite the availability of open-source LLMs trained using a wide range of biomedical data, current research on the applications of LLMs to genomic
External link:
http://arxiv.org/abs/2406.15534
Authors:
Li, Ming, Chen, Pei, Wang, Chenguang, Zhao, Hongyu, Liang, Yijun, Hou, Yupeng, Liu, Fuxiao, Zhou, Tianyi
Finetuning large language models with a variety of instruction-response pairs has enhanced their capability to understand and follow instructions. Current instruction tuning primarily relies on teacher models or human intervention to generate and ref
External link:
http://arxiv.org/abs/2405.13326
Authors:
Li, Ming, Zhang, Yong, He, Shwai, Li, Zhitao, Zhao, Hongyu, Wang, Jianzong, Cheng, Ning, Zhou, Tianyi
Instruction tuning is critical to improve LLMs but usually suffers from low-quality and redundant data. Data filtering for instruction tuning has proved important in improving both the efficiency and performance of the tuning process. But it also lea
External link:
http://arxiv.org/abs/2402.00530
Authors:
Zhao, Hongyu, Tang, Zezhi, Li, Zhenhong, Dong, Yi, Si, Yuancheng, Lu, Mingyang, Panoutsos, George
The optimisation of crop harvesting processes for commonly cultivated crops is of great importance in the aim of agricultural industrialisation. Nowadays, the utilisation of machine vision has enabled the automated identification of crops, leading to
External link:
http://arxiv.org/abs/2401.15785
A univariate continuous function can always be decomposed as the sum of a non-increasing function and a non-decreasing one. Based on this property, we propose a non-parametric regression method that combines two spline-fitted monotone curves. We demo
External link:
http://arxiv.org/abs/2401.06383
Multi-ship tracking (MST) as a core technology has been proven to be applied to situational awareness at sea and the development of a navigational system for autonomous ships. Despite impressive tracking outcomes achieved by multi-object tracking (MO
External link:
http://arxiv.org/abs/2310.05171
Discovering genes with similar functions across diverse biomedical contexts poses a significant challenge in gene representation learning due to data heterogeneity. In this study, we resolve this problem by introducing a novel model called Multimodal
External link:
http://arxiv.org/abs/2310.02275
Model degrees of freedom ($\df$) is a fundamental concept in statistics because it quantifies the flexibility of a fitting procedure and is indispensable in model selection. To investigate the gap between $\df$ and the number of independent variables
External link:
http://arxiv.org/abs/2308.13630