Showing 1 - 10 of 10,805 for search: '"An, Baosheng"'
Author:
Chen, Xiaohui, Shukla, Satya Narayan, Azab, Mahmoud, Singh, Aashu, Wang, Qifan, Yang, David, Peng, ShengYun, Yu, Hanchao, Yan, Shen, Zhang, Xuewen, He, Baosheng
How well can Multimodal Large Language Models (MLLMs) understand composite images? Composite images (CIs) are synthetic visuals created by merging multiple visual elements, such as charts, posters, or screenshots, rather than being captured directly
External link:
http://arxiv.org/abs/2412.05243
Do Large Language Models Truly Grasp Mathematics? An Empirical Exploration From Cognitive Psychology
Author:
Xie, Wei, Ma, Shuoyoucheng, Wang, Zhenhua, Wang, Enze, Chen, Kai, Sun, Xiaobing, Wang, Baosheng
The cognitive mechanism by which Large Language Models (LLMs) solve mathematical problems remains a widely debated and unresolved issue. Currently, there is little interpretable experimental evidence that connects LLMs' problem-solving with human cog
External link:
http://arxiv.org/abs/2410.14979
Retinal image registration plays an important role in the ophthalmological diagnosis process. Since there exist variances in viewing angles and anatomical structures across different retinal images, keypoint-based approaches become the mainstream met
External link:
http://arxiv.org/abs/2409.01068
In recent years, the multimedia forensics and security community has seen remarkable progress in multitask learning for DeepFake (i.e., face forgery) detection. The prevailing strategy has been to frame DeepFake detection as a binary classification p
External link:
http://arxiv.org/abs/2408.16305
In recent years, deep learning has greatly streamlined the process of generating realistic fake face images. Aware of the dangers, researchers have developed various tools to spot these counterfeits. Yet none asked the fundamental question: What digi
External link:
http://arxiv.org/abs/2405.08487
Federated Learning (FL) seeks to train a model collaboratively without sharing private training data from individual clients. Despite its promise, FL encounters challenges such as high communication costs for large-scale models and the necessity for
External link:
http://arxiv.org/abs/2404.08564
Author:
Wang, Zhenhua, Xie, Wei, Wang, Baosheng, Wang, Enze, Gui, Zhiwen, Ma, Shuoyoucheng, Chen, Kai
Large Language Models (LLMs) have gradually become the gateway for people to acquire new knowledge. However, attackers can break the model's security protection ("jail") to access restricted information, which is called "jailbreaking." Previous studi
External link:
http://arxiv.org/abs/2402.15690
The copilot framework, which aims to enhance and tailor large language models (LLMs) for specific complex tasks without requiring fine-tuning, is gaining increasing attention from the community. In this paper, we introduce the construction of a Healt
External link:
http://arxiv.org/abs/2402.13408
Author:
Psychogyios, Dimitrios, Colleoni, Emanuele, Van Amsterdam, Beatrice, Li, Chih-Yang, Huang, Shu-Yu, Li, Yuchong, Jia, Fucang, Zou, Baosheng, Wang, Guotai, Liu, Yang, Boels, Maxence, Huo, Jiayu, Sparks, Rachel, Dasgupta, Prokar, Granados, Alejandro, Ourselin, Sebastien, Xu, Mengya, Wang, An, Wu, Yanan, Bai, Long, Ren, Hongliang, Yamada, Atsushi, Harai, Yuriko, Ishikawa, Yuto, Hayashi, Kazuyuki, Simoens, Jente, DeBacker, Pieter, Cisternino, Francesco, Furnari, Gabriele, Mottrie, Alex, Ferraguti, Federica, Kondo, Satoshi, Kasai, Satoshi, Hirasawa, Kousuke, Kim, Soohee, Lee, Seung Hyun, Lee, Kyu Eun, Kong, Hyoun-Joong, Fu, Kui, Li, Chao, An, Shan, Krell, Stefanie, Bodenstedt, Sebastian, Ayobi, Nicolas, Perez, Alejandra, Rodriguez, Santiago, Puentes, Juanita, Arbelaez, Pablo, Mohareri, Omid, Stoyanov, Danail
Surgical tool segmentation and action recognition are fundamental building blocks in many computer-assisted intervention applications, ranging from surgical skills assessment to decision support systems. Nowadays, learning-based action recognition an
External link:
http://arxiv.org/abs/2401.00496
Significant progress has been made recently in point cloud segmentation utilizing an encoder-decoder framework, which initially encodes point clouds into low-resolution representations and subsequently decodes high-resolution predictions. Inspired by
External link:
http://arxiv.org/abs/2310.07743