Showing 1 - 10 of 702 for search: '"Baldwin Timothy"'
Author:
Li, Haonan, Han, Xudong, Zhai, Zenan, Mu, Honglin, Wang, Hao, Zhang, Zhenxuan, Geng, Yilin, Lin, Shom, Wang, Renxi, Shelmanov, Artem, Qi, Xiangyu, Wang, Yuxia, Hong, Donghai, Yuan, Youliang, Chen, Meng, Tu, Haoqin, Koto, Fajri, Kuribayashi, Tatsuki, Zeng, Cong, Bhardwaj, Rishabh, Zhao, Bingchen, Duan, Yawen, Liu, Yi, Alghamdi, Emad A., Yang, Yaodong, Dong, Yinpeng, Poria, Soujanya, Liu, Pengfei, Liu, Zhengzhong, Ren, Xuguang, Hovy, Eduard, Gurevych, Iryna, Nakov, Preslav, Choudhury, Monojit, Baldwin, Timothy
To address this gap, we introduce Libra-Leaderboard, a comprehensive framework designed to rank LLMs through a balanced evaluation of performance and safety. Combining a dynamic leaderboard with an interactive LLM arena, Libra-Leaderboard encourages…
External link:
http://arxiv.org/abs/2412.18551
Author:
Mullappilly, Sahal Shaji, Kurpath, Mohammed Irfan, Pieri, Sara, Alseiari, Saeed Yahya, Cholakkal, Shanavas, Aldahmani, Khaled, Khan, Fahad, Anwer, Rao, Khan, Salman, Baldwin, Timothy, Cholakkal, Hisham
This paper introduces BiMediX2, a bilingual (Arabic-English) Bio-Medical EXpert Large Multimodal Model (LMM) with a unified architecture that integrates text and visual modalities, enabling advanced image understanding and medical applications. BiMediX2…
External link:
http://arxiv.org/abs/2412.07769
The growing use of large language models (LLMs) has raised concerns regarding their safety. While many studies have focused on English, the safety of LLMs in Arabic, with its linguistic and cultural complexities, remains under-explored. Here, we aim…
External link:
http://arxiv.org/abs/2410.17040
As large language models (LLMs) advance, their inability to autonomously execute tasks by directly interacting with external tools remains a critical limitation. Traditional methods rely on inputting tool descriptions as context, which is constrained…
External link:
http://arxiv.org/abs/2410.03439
Author:
Li, Haonan, Han, Xudong, Wang, Hao, Wang, Yuxia, Wang, Minghan, Xing, Rui, Geng, Yilin, Zhai, Zenan, Nakov, Preslav, Baldwin, Timothy
We introduce Loki, an open-source tool designed to address the growing problem of misinformation. Loki adopts a human-centered approach, striking a balance between the quality of fact-checking and the cost of human involvement. It decomposes the fact…
External link:
http://arxiv.org/abs/2410.01794
Author:
Vazhentsev, Artem, Fadeeva, Ekaterina, Xing, Rui, Panchenko, Alexander, Nakov, Preslav, Baldwin, Timothy, Panov, Maxim, Shelmanov, Artem
Uncertainty quantification (UQ) is a promising approach to detecting Large Language Model (LLM) hallucinations and low-quality outputs. In this work, we address one of the challenges of UQ in generation tasks that arises from the conditional dependency…
External link:
http://arxiv.org/abs/2408.10692
This paper explores the task of automatic prediction of text spans in a legal problem description that support a legal area label. We use a corpus of problem descriptions written by laypeople in English that is annotated by practising lawyers. Inherent…
External link:
http://arxiv.org/abs/2408.02257
We propose selective debiasing -- an inference-time safety mechanism that aims to increase the overall quality of models in terms of prediction performance and fairness in the situation when re-training a model is prohibitive. The method is inspired…
External link:
http://arxiv.org/abs/2407.19345
Author:
Yun, Sukmin, Lin, Haokun, Thushara, Rusiru, Bhat, Mohammad Qazim, Wang, Yongxin, Jiang, Zutao, Deng, Mingkai, Wang, Jinhong, Tao, Tianhua, Li, Junbo, Li, Haonan, Nakov, Preslav, Baldwin, Timothy, Liu, Zhengzhong, Xing, Eric P., Liang, Xiaodan, Shen, Zhiqiang
Multimodal large language models (MLLMs) have shown impressive success across modalities such as image, video, and audio in a variety of understanding and generation tasks. However, current MLLMs are surprisingly poor at understanding webpage screenshots…
External link:
http://arxiv.org/abs/2406.20098
Author:
Vashurin, Roman, Fadeeva, Ekaterina, Vazhentsev, Artem, Rvanova, Lyudmila, Tsvigun, Akim, Vasilev, Daniil, Xing, Rui, Sadallah, Abdelrahman Boda, Grishchenkov, Kirill, Petrakov, Sergey, Panchenko, Alexander, Baldwin, Timothy, Nakov, Preslav, Panov, Maxim, Shelmanov, Artem
Uncertainty quantification (UQ) is a critical component of machine learning (ML) applications. The rapid proliferation of large language models (LLMs) has stimulated researchers to seek efficient and effective approaches to UQ for text generation. As…
External link:
http://arxiv.org/abs/2406.15627