Showing 1 - 10 of 2,307 for search: '"Tan, Hao"'
The intensity correlations due to imperfect modulation during the quantum-state preparation in a measurement-device-independent quantum key distribution (MDI QKD) system compromise its security performance. Therefore, it is crucial to assess the impa
External link:
http://arxiv.org/abs/2408.08011
Multi-label image recognition is a fundamental task in computer vision. Recently, Vision-Language Models (VLMs) have made notable advancements in this area. However, previous methods fail to effectively leverage the rich knowledge in language models
External link:
http://arxiv.org/abs/2407.20920
Identifying Speakers in Dialogue Transcripts: A Text-based Approach Using Pretrained Language Models
Author:
Nguyen, Minh, Dernoncourt, Franck, Yoon, Seunghyun, Deilamsalehy, Hanieh, Tan, Hao, Rossi, Ryan, Tran, Quan Hung, Bui, Trung, Nguyen, Thien Huu
We introduce an approach to identifying speaker names in dialogue transcripts, a crucial task for enhancing content accessibility and searchability in digital media archives. Despite the advancements in speech recognition, the task of text-based spea
External link:
http://arxiv.org/abs/2407.12094
Author:
Ding, Zhengqing, Cao, Juntian, Zhan, Kun, Chen, Yihang, Zhou, Lidan, Tan, Hao, Yang, Chenao, Yu, Ying, Niu, Zhichuan, Yu, Siyuan
Traditional Distributed Feedback (DFB) or Distributed Bragg Reflector (DBR) lasers typically utilize buried gratings as frequency-selective optical feedback mechanisms. However, the fabrication of such gratings often necessitates regrowth processes,
External link:
http://arxiv.org/abs/2407.07690
The tolerance analysis of freeform surfaces plays a crucial role in the development of advanced imaging systems. However, the intricate relationship between surface error and imaging quality poses significant challenges, necessitating dense sampling
External link:
http://arxiv.org/abs/2407.03688
Author:
Xie, Desai, Bi, Sai, Shu, Zhixin, Zhang, Kai, Xu, Zexiang, Zhou, Yi, Pirk, Sören, Kaufman, Arie, Sun, Xin, Tan, Hao
We present LRM-Zero, a Large Reconstruction Model (LRM) trained entirely on synthesized 3D data, achieving high-quality sparse-view 3D reconstruction. The core of LRM-Zero is our procedural 3D dataset, Zeroverse, which is automatically synthesized fr
External link:
http://arxiv.org/abs/2406.09371
Author:
Zhang, Kai, Bi, Sai, Tan, Hao, Xiangli, Yuanbo, Zhao, Nanxuan, Sunkavalli, Kalyan, Xu, Zexiang
We propose GS-LRM, a scalable large reconstruction model that can predict high-quality 3D Gaussian primitives from 2-4 posed sparse images in 0.23 seconds on a single A100 GPU. Our model features a very simple transformer-based architecture; we patchif
External link:
http://arxiv.org/abs/2404.19702
Do vision-language models (VLMs) pre-trained to caption an image of a "durian" learn visual concepts such as "brown" (color) and "spiky" (texture) at the same time? We aim to answer this question as visual concepts learned "for free" would enable wid
External link:
http://arxiv.org/abs/2404.12652
Author:
Cao, Shengcao, Gu, Jiuxiang, Kuen, Jason, Tan, Hao, Zhang, Ruiyi, Zhao, Handong, Nenkova, Ani, Gui, Liang-Yan, Sun, Tong, Wang, Yu-Xiong
Open-world entity segmentation, as an emerging computer vision task, aims at segmenting entities in images without being restricted by pre-defined classes, offering impressive generalization capabilities on unseen images and concepts. Despite its pro
External link:
http://arxiv.org/abs/2404.12386
Author:
Wei, Xinyue, Zhang, Kai, Bi, Sai, Tan, Hao, Luan, Fujun, Deschaintre, Valentin, Sunkavalli, Kalyan, Su, Hao, Xu, Zexiang
We propose MeshLRM, a novel LRM-based approach that can reconstruct a high-quality mesh from merely four input images in less than one second. Different from previous large reconstruction models (LRMs) that focus on NeRF-based reconstruction, MeshLRM
External link:
http://arxiv.org/abs/2404.12385