Showing 1 - 10 of 4,879 results for search: '"Bohan P"'
We present Training Noise Token (TNT) Pruning for vision transformers. Our method relaxes the discrete token dropping condition to continuous additive noise, providing smooth optimization in training, while retaining discrete drop
External link:
http://arxiv.org/abs/2411.18092
Author:
Li, Dawei, Jiang, Bohan, Huang, Liangjie, Beigi, Alimohammad, Zhao, Chengshuai, Tan, Zhen, Bhattacharjee, Amrita, Jiang, Yuxuan, Chen, Canyu, Wu, Tianhao, Shu, Kai, Cheng, Lu, Liu, Huan
Assessment and evaluation have long been critical challenges in artificial intelligence (AI) and natural language processing (NLP). However, traditional methods, whether matching-based or embedding-based, often fall short of judging subtle attributes
External link:
http://arxiv.org/abs/2411.16594
Author:
Chen, Feng, Gou, Chenhui, Liu, Jing, Yang, Yang, Li, Zhaoyang, Zhang, Jiyuan, Sun, Zhenbang, Zhuang, Bohan, Wu, Qi
As multimodal large language models (MLLMs) advance rapidly, rigorous evaluation has become essential, providing further guidance for their development. In this work, we focus on a unified and robust evaluation of vision perception abilities
External link:
http://arxiv.org/abs/2411.14725
This paper explores the optimal investment problem of a renewal risk model with generalized Erlang distributed interarrival times. We assume that the phases of the interarrival time can be observed. The price of the risky asset is driven by the CEV m
External link:
http://arxiv.org/abs/2411.13111
Leveraging the flexible expressive ability of (Max)SMT and the powerful solving ability of SMT solvers, we propose a novel layout model named SMT-Layout. SMT-Layout is the first constraint-based layout model that can support real-time interaction for
External link:
http://arxiv.org/abs/2411.12271
We study the design of transfer functions for volumetric rendering of magnetic resonance imaging (MRI) datasets of human hands. Human hands are anatomically complex, containing various organs within a limited space, which presents challenges for volu
External link:
http://arxiv.org/abs/2411.18630
Blind image quality assessment (BIQA) serves as a fundamental task in computer vision, yet it often fails to consistently align with human subjective perception. Recent advances show that multi-scale evaluation strategies are promising due to their a
External link:
http://arxiv.org/abs/2411.09007
Author:
Chen, Yuedong, Zheng, Chuanxia, Xu, Haofei, Zhuang, Bohan, Vedaldi, Andrea, Cham, Tat-Jen, Cai, Jianfei
We introduce MVSplat360, a feed-forward approach for 360° novel view synthesis (NVS) of diverse real-world scenes, using only sparse observations. This setting is inherently ill-posed due to minimal overlap among input views and insufficient vis
External link:
http://arxiv.org/abs/2411.04924
Pre-training for Reinforcement Learning (RL) with purely video data is a valuable yet challenging problem. Although in-the-wild videos are readily available and contain a vast amount of prior world knowledge, the absence of action annotations and the
External link:
http://arxiv.org/abs/2411.03169
Author:
Leng, Xinyi, Liang, Jason, Mauro, Jack, Wang, Xu, Bertozzi, Andrea L., Chapman, James, Lin, Junyuan, Chen, Bohan, Ye, Chenchen, Daniel, Temple, Brantingham, P. Jeffrey
Narrative data spans all disciplines and provides a coherent model of the world to the reader or viewer. Recent advances in machine learning and Large Language Models (LLMs) have enabled great strides in analyzing natural language. However, Large l
External link:
http://arxiv.org/abs/2411.02435