Showing 1 - 10 of 27,292 results for search: '"Zhang., Ning"'
Author:
Zohar, Orr, Wang, Xiaohan, Dubois, Yann, Mehta, Nikhil, Xiao, Tong, Hansen-Estruch, Philippe, Yu, Licheng, Wang, Xiaofang, Juefei-Xu, Felix, Zhang, Ning, Yeung-Levy, Serena, Xia, Xide
Despite the rapid integration of video perception capabilities into Large Multimodal Models (LMMs), the underlying mechanisms driving their video understanding remain poorly understood. Consequently, many design decisions in this domain are made with
External link:
http://arxiv.org/abs/2412.10360
We propose a sequential measurement protocol for accurate low-temperature estimation. The resulting correlated outputs significantly enhance the low temperature precision compared to that of the independent measurement scheme. This enhancement manife
External link:
http://arxiv.org/abs/2412.04878
Author:
Lai, Bolin, Juefei-Xu, Felix, Liu, Miao, Dai, Xiaoliang, Mehta, Nikhil, Zhu, Chenguang, Huang, Zeyi, Rehg, James M., Lee, Sangmin, Zhang, Ning, Xiao, Tong
Text-guided image manipulation has experienced notable advancement in recent years. In order to mitigate linguistic ambiguity, few-shot learning with visual examples has been applied for instructions that are underrepresented in the training set, or
External link:
http://arxiv.org/abs/2412.01027
Author:
Zhao, Shiyu, Wang, Zhenting, Juefei-Xu, Felix, Xia, Xide, Liu, Miao, Wang, Xiaofang, Liang, Mingfu, Zhang, Ning, Metaxas, Dimitris N., Yu, Licheng
Prevailing Multimodal Large Language Models (MLLMs) encode the input image(s) as vision tokens and feed them into the language backbone, similar to how Large Language Models (LLMs) process the text tokens. However, the number of vision tokens increas
External link:
http://arxiv.org/abs/2412.00556
Author:
He, Bin, Ying, Yuzhe, Shi, Yejiong, Meng, Zhe, Yin, Zichen, Chen, Zhengyu, Hu, Zhangwei, Xue, Ruizhi, Jing, Linkai, Lu, Yang, Sun, Zhenxing, Man, Weitao, Wu, Youtu, Lei, Dan, Zhang, Ning, Wang, Guihuai, Xue, Ping
Current surgical procedures for spinal cord tumors lack in vivo high-resolution, high-speed multifunctional imaging systems, posing challenges for precise tumor resection and intraoperative decision-making. This study introduces the Fast Adaptive Foc
External link:
http://arxiv.org/abs/2410.21809
As large language model (LLM) agents increasingly integrate into our infrastructure, their robust coordination and message synchronization become vital. The Byzantine Generals Problem (BGP) is a critical model for constructing resilient multi-agent s
External link:
http://arxiv.org/abs/2410.16237
Author:
Liu, Han, Tang, Xianfeng, Chen, Tianlang, Liu, Jiapeng, Indu, Indu, Zou, Henry Peng, Dai, Peng, Galan, Roberto Fernandez, Porter, Michael D, Jia, Dongmei, Zhang, Ning, Xiong, Lian
The fashion industry is one of the leading domains in the global e-commerce sector, prompting major online retailers to employ recommendation systems for product suggestions and customer convenience. While recommendation systems have been widely stud
External link:
http://arxiv.org/abs/2410.11327
Published in:
in Proceedings of the 31st ACM International Conference on Multimedia, pp. 1431-1442, 2023
Traditional image codecs emphasize signal fidelity and human perception, often at the expense of machine vision tasks. Deep learning methods have demonstrated promising coding performance by utilizing rich semantic embeddings optimized for both human
External link:
http://arxiv.org/abs/2410.06149
Author:
Li, Zhangpu, Zou, Changhong, Ma, Suxue, Yang, Zhicheng, Du, Chen, Tang, Youbao, Cao, Zhenjie, Zhang, Ning, Lai, Jui-Hsin, Lin, Ruei-Sung, Ni, Yuan, Sun, Xingzhi, Xiao, Jing, Hou, Jieke, Zhang, Kai, Han, Mei
The rocketing prosperity of large language models (LLMs) in recent years has boosted the prevalence of vision-language models (VLMs) in the medical sector. In our online medical consultation scenario, a doctor responds to the texts and images provide
External link:
http://arxiv.org/abs/2409.17610
Author:
He, Zecheng, Sun, Bo, Juefei-Xu, Felix, Ma, Haoyu, Ramchandani, Ankit, Cheung, Vincent, Shah, Siddharth, Kalia, Anmol, Subramanyam, Harihar, Zareian, Alireza, Chen, Li, Jain, Ankit, Zhang, Ning, Zhang, Peizhao, Sumbaly, Roshan, Vajda, Peter, Sinha, Animesh
Diffusion models have demonstrated remarkable efficacy across various image-to-image tasks. In this research, we introduce Imagine yourself, a state-of-the-art model designed for personalized image generation. Unlike conventional tuning-based persona
External link:
http://arxiv.org/abs/2409.13346