Showing 1 - 10 of 21,067 results for search: '"An, Shuyu"'
Text-To-Image (TTI) generation is significant for controlled and diverse image generation with broad potential applications. Although current medical TTI methods have made some progress in report-to-Chest-Xray (CXR) generation, their generation perfo
External link:
http://arxiv.org/abs/2410.20165
We present Agent S, an open agentic framework that enables autonomous interaction with computers through a Graphical User Interface (GUI), aimed at transforming human-computer interaction by automating complex, multi-step tasks. Agent S aims to addre
External link:
http://arxiv.org/abs/2410.08164
By coupling unstable components, we demonstrate a novel approach that reduces static modulus to zero, eliminating causality-imposed absorption limitations in acoustics. Our heuristic model simulations achieve ultra-broadband absorption over 99% for w
External link:
http://arxiv.org/abs/2410.06859
Authors:
Xu, Anfeng, Zhang, Biqiao, Kong, Shuyu, Huang, Yiteng, Yang, Zhaojun, Srivastava, Sangeeta, Sun, Ming
Keyword spotting (KWS) is an important speech processing component for smart devices with voice assistance capability. In this paper, we investigate if Kolmogorov-Arnold Networks (KAN) can be used to enhance the performance of KWS. We explore various
External link:
http://arxiv.org/abs/2409.08605
Authors:
Wang, Zhenyu, Kong, Shuyu, Wan, Li, Zhang, Biqiao, Huang, Yiteng, Jin, Mumin, Sun, Ming, Lei, Xin, Yang, Zhaojun
Published in:
INTERSPEECH 2024
Existing keyword spotting (KWS) systems primarily rely on predefined keyword phrases. However, the ability to recognize customized keywords is crucial for tailoring interactions with intelligent devices. In this paper, we present a novel Query-by-Exa
External link:
http://arxiv.org/abs/2409.00099
Authors:
Lu, Tao, Wu, Muzhe, Lu, Xinyi, Xu, Siyuan, Zhan, Shuyu, Tambwekar, Anuj, Provost, Emily Mower
Harsh working environments and work-related stress have been known to contribute to mental health problems such as anxiety, depression, and suicidal ideation. As such, it is paramount to create solutions that can both detect employee unhappiness and
External link:
http://arxiv.org/abs/2408.13473
Authors:
Liu, Xinyu, Shen, Shuyu, Li, Boyan, Ma, Peixian, Jiang, Runzhi, Luo, Yuyu, Zhang, Yuxin, Fan, Ju, Li, Guoliang, Tang, Nan
Translating users' natural language queries (NL) into SQL queries (i.e., NL2SQL) can significantly reduce barriers to accessing relational databases and support various commercial applications. The performance of NL2SQL has been greatly enhanced with
External link:
http://arxiv.org/abs/2408.05109
Existing auto-regressive language models have demonstrated a remarkable capability to perform a new task with just a few examples in prompt, without requiring any additional training. In order to extend this capability to a multi-modal setting (i.e.
External link:
http://arxiv.org/abs/2407.14875
In deep reinforcement learning applications, maximizing discounted reward is often employed instead of maximizing total reward to ensure the convergence and stability of algorithms, even though the performance metric for evaluating the policy remains
External link:
http://arxiv.org/abs/2407.13279
Quantum singular value transformation (QSVT) is a framework that has been shown to unify many primitives in quantum algorithms. In this work, we leverage the QSVT framework in two directions. We first show that the QSVT framework can accelerate one r
External link:
http://arxiv.org/abs/2407.11744