Showing 1 - 10 of 364 results for the search: '"Ma, Mingyu"'
We seek to address a core challenge facing current Large Language Models (LLMs). LLMs have demonstrated superior performance in many tasks, yet continue to struggle with reasoning problems on explicit graphs that require multiple steps. To address th…
External link:
http://arxiv.org/abs/2410.22597
Existing retrieval-based reasoning approaches for large language models (LLMs) rely heavily on the density and quality of the non-parametric knowledge source to provide domain knowledge and an explicit reasoning chain. However, inclusive knowledge sourc…
External link:
http://arxiv.org/abs/2410.08475
Large language models (LLMs) are increasingly applied to clinical decision-making. However, their potential to exhibit bias poses significant risks to clinical equity. Currently, there is a lack of benchmarks that systematically evaluate such clinica…
External link:
http://arxiv.org/abs/2407.05250
Recent advancements in Large Language Models (LLMs) have empowered LLM agents to autonomously collect world information and reason over it to solve complex problems. Given this capability, increasing interest has been put into employi…
External link:
http://arxiv.org/abs/2407.01231
Author:
Ma, Mingyu Derek, Ye, Chenchen, Yan, Yu, Wang, Xiaoxuan, Ping, Peipei, Chang, Timothy S, Wang, Wei
The integration of Artificial Intelligence (AI), especially Large Language Models (LLMs), into the clinical diagnosis process offers significant potential to improve the efficiency and accessibility of medical care. While LLMs have shown some promise…
External link:
http://arxiv.org/abs/2406.09923
Author:
Wang, Fei, Fu, Xingyu, Huang, James Y., Li, Zekun, Liu, Qin, Liu, Xiaogeng, Ma, Mingyu Derek, Xu, Nan, Zhou, Wenxuan, Zhang, Kai, Yan, Tianyi Lorena, Mo, Wenjie Jacky, Liu, Hsiang-Hui, Lu, Pan, Li, Chunyuan, Xiao, Chaowei, Chang, Kai-Wei, Roth, Dan, Zhang, Sheng, Poon, Hoifung, Chen, Muhao
We introduce MuirBench, a comprehensive benchmark that focuses on robust multi-image understanding capabilities of multimodal LLMs. MuirBench consists of 12 diverse multi-image tasks (e.g., scene understanding, ordering) that involve 10 categories of…
External link:
http://arxiv.org/abs/2406.09411
Author:
Cai, Zefan, Kung, Po-Nien, Suvarna, Ashima, Ma, Mingyu Derek, Bansal, Hritik, Chang, Baobao, Brantingham, P. Jeffrey, Wang, Wei, Peng, Nanyun
Existing approaches to zero-shot event detection usually train models on datasets annotated with known event types and prompt them with unseen event definitions. These approaches yield sporadic successes, yet generally fall short of expectations. In…
External link:
http://arxiv.org/abs/2403.02586
The exorbitant cost of training Large Language Models (LLMs) from scratch makes it essential to fingerprint the models to protect intellectual property via ownership authentication and to ensure downstream users and developers comply with their licen…
External link:
http://arxiv.org/abs/2401.12255
Author:
Liu, Yanchen, Ma, Mingyu Derek, Qin, Wenna, Zhou, Azure, Chen, Jiaao, Shi, Weiyan, Wang, Wei, Yang, Diyi
Susceptibility to misinformation describes the degree of belief in unverifiable claims, a latent aspect of individuals' mental processes that is not observable. Existing susceptibility studies rely heavily on self-reported beliefs, which can be subje…
External link:
http://arxiv.org/abs/2311.09630
Author:
Ma, Mingyu Derek, Kao, Jiun-Yu, Gupta, Arpit, Lin, Yu-Hsiang, Zhao, Wenbo, Chung, Tagyoung, Wang, Wei, Chang, Kai-Wei, Peng, Nanyun
Models for various NLP tasks have been shown to exhibit stereotypes, and bias in question answering (QA) models is especially harmful, as the output answers may be directly consumed by end users. There have been datasets to evaluate bias…
External link:
http://arxiv.org/abs/2310.08795