Showing 1 - 10 of 45 for search: '"Zhang, Sai Qian"'
Authors:
Hsieh, He-Yen, Li, Ziyun, Zhang, Sai Qian, Ting, Wei-Te Mark, Chang, Kao-Den, De Salvo, Barbara, Liu, Chiao, Kung, H. T.
We present GazeGen, a user interaction system that generates visual content (images and videos) for locations indicated by the user's eye gaze. GazeGen allows intuitive manipulation of visual content by targeting regions of interest with gaze. Using
External link:
http://arxiv.org/abs/2411.04335
Authors:
Augustin, Maximilian, Sarwar, Syed Shakib, Elhoushi, Mostafa, Zhang, Sai Qian, Li, Yuecheng, De Salvo, Barbara
Following their success in natural language processing (NLP), there has been a shift towards transformer models in computer vision. While transformers perform well and offer promising multi-tasking performance, due to their high compute requirements,
External link:
http://arxiv.org/abs/2410.17661
Authors:
Zhao, Yiwei, Li, Ziyun, Khwa, Win-San, Sun, Xiaoyu, Zhang, Sai Qian, Sarwar, Syed Shakib, Stangherlin, Kleber Hugo, Lu, Yi-Lun, Gomez, Jorge Tomas, Seo, Jae-Sun, Gibbons, Phillip B., De Salvo, Barbara, Liu, Chiao
Low-Latency and Low-Power Edge AI is essential for Virtual Reality and Augmented Reality applications. Recent advances show that hybrid models, combining convolution layers (CNN) and transformers (ViT), often achieve superior accuracy/performance tra
External link:
http://arxiv.org/abs/2410.08326
Authors:
Dong, Zhenyuan, Zhang, Sai Qian
Diffusion Transformers (DiTs) have recently attracted significant interest from both industry and academia due to their enhanced capabilities in visual generation, surpassing the performance of traditional diffusion models that employ U-Net. However,
External link:
http://arxiv.org/abs/2409.07756
Speech-driven 3D motion synthesis seeks to create lifelike animations based on human speech, with potential uses in virtual reality, gaming, and film production. Existing approaches rely solely on speech audio for motion generation, leading to i
External link:
http://arxiv.org/abs/2408.12885
Large Language Models (LLMs) are effective in computer hardware synthesis via hardware description language (HDL) generation. However, LLM-assisted approaches for HDL generation struggle when handling complex tasks. We introduce a suite of hierarchic
External link:
http://arxiv.org/abs/2407.18276
Authors:
Liu, Wenxuan, Zhang, Sai Qian
Diffusion Transformers (DiTs) have recently gained substantial attention in both industrial and academic fields for their superior visual generation capabilities, outperforming traditional diffusion models that use U-Net. However, the enhanced perform
External link:
http://arxiv.org/abs/2405.19751
Authors:
Gao, Chao, Zhang, Sai Qian
To enhance the performance of large language models (LLMs) on downstream tasks, one solution is to fine-tune certain LLM parameters so that the model aligns better with the characteristics of the training dataset. This process is commonly known as parameter-
External link:
http://arxiv.org/abs/2404.05182
Large models represent a groundbreaking advancement in multiple application fields, enabling remarkable achievements across various tasks. However, their unprecedented scale comes with significant computational costs. These models, often consisting o
External link:
http://arxiv.org/abs/2403.14608
Authors:
Xia, Tianhua, Zhang, Sai Qian
The attention mechanism is a pivotal element within the transformer architecture, making a substantial contribution to its exceptional performance. Within this attention mechanism, Softmax is an imperative component that enables the model to assess t
External link:
http://arxiv.org/abs/2311.13290