Showing 1 - 10 of 6,754 for search: '"Thakker A"'
Author:
Royce, Rob, Kaufmann, Marcel, Becktor, Jonathan, Moon, Sangwoo, Carpenter, Kalind, Pak, Kai, Towler, Amanda, Thakker, Rohan, Khattak, Shehryar
The advancement of robotic systems has revolutionized numerous industries, yet their operation often demands specialized technical knowledge, limiting accessibility for non-expert users. This paper introduces ROSA (Robot Operating System Agent), an A
External link:
http://arxiv.org/abs/2410.06472
Formal theorem proving, a field at the intersection of mathematics and computer science, has seen renewed interest with advancements in large language models (LLMs). This paper introduces SubgoalXL, a novel approach that synergizes subgoal-based proo
External link:
http://arxiv.org/abs/2408.11172
Large Language Models (LLMs) have revolutionized the landscape of machine learning, yet current benchmarks often fall short in capturing the diverse behavior of these models in real-world applications. A benchmark's usefulness is determined by its ab
External link:
http://arxiv.org/abs/2408.08808
Disaster events often unfold rapidly, necessitating a swift and effective response. Developing action plans, resource allocation, and resolution of help requests in disaster scenarios is time-consuming and complex since disaster-relevant information
External link:
http://arxiv.org/abs/2409.00004
Author:
Wu, Haibin, Wang, Xiaofei, Eskimez, Sefik Emre, Thakker, Manthan, Tompkins, Daniel, Tsai, Chung-Hsien, Li, Canrun, Xiao, Zhen, Zhao, Sheng, Li, Jinyu, Kanda, Naoyuki
People change their tones of voice, often accompanied by nonverbal vocalizations (NVs) such as laughter and cries, to convey rich emotions. However, most text-to-speech (TTS) systems lack the capability to generate speech with rich emotions, includin
External link:
http://arxiv.org/abs/2407.12229
Author:
Eskimez, Sefik Emre, Wang, Xiaofei, Thakker, Manthan, Li, Canrun, Tsai, Chung-Hsien, Xiao, Zhen, Yang, Hemin, Zhu, Zirun, Tang, Min, Tan, Xu, Liu, Yanqing, Zhao, Sheng, Kanda, Naoyuki
This paper introduces Embarrassingly Easy Text-to-Speech (E2 TTS), a fully non-autoregressive zero-shot text-to-speech system that offers human-level naturalness and state-of-the-art speaker similarity and intelligibility. In the E2 TTS framework, th
External link:
http://arxiv.org/abs/2406.18009
Author:
Wang, Xiaofei, Eskimez, Sefik Emre, Thakker, Manthan, Yang, Hemin, Zhu, Zirun, Tang, Min, Xia, Yufei, Li, Jinzhu, Zhao, Sheng, Li, Jinyu, Kanda, Naoyuki
Recently, zero-shot text-to-speech (TTS) systems, capable of synthesizing any speaker's voice from a short audio prompt, have made rapid advancements. However, the quality of the generated speech significantly deteriorates when the audio prompt conta
External link:
http://arxiv.org/abs/2406.05699
Author:
Eskimez, Sefik Emre, Wang, Xiaofei, Thakker, Manthan, Tsai, Chung-Hsien, Li, Canrun, Xiao, Zhen, Yang, Hemin, Zhu, Zirun, Tang, Min, Li, Jinyu, Zhao, Sheng, Kanda, Naoyuki
Accurate control of the total duration of generated speech by adjusting the speech rate is crucial for various text-to-speech (TTS) applications. However, the impact of adjusting the speech rate on speech quality, such as intelligibility and speaker
External link:
http://arxiv.org/abs/2406.04281
Author:
Kureshi, Rameez Raja, Mishra, Bhupesh Kumar, Thakker, Dhavalkumar, Mazumdar, Suvodeep, Li, Xiao
The detrimental effects of air pollutants on human health have prompted increasing concerns regarding indoor air quality (IAQ). The emergence of digital health interventions and citizen science initiatives has provided new avenues for raising awarene
External link:
http://arxiv.org/abs/2405.13064
Author:
Prabhakar, Raghu, Sivaramakrishnan, Ram, Gandhi, Darshan, Du, Yun, Wang, Mingran, Song, Xiangyu, Zhang, Kejie, Gao, Tianren, Wang, Angela, Li, Karen, Sheng, Yongning, Brot, Joshua, Sokolov, Denis, Vivek, Apurv, Leung, Calvin, Sabnis, Arjun, Bai, Jiayu, Zhao, Tuowen, Gottscho, Mark, Jackson, David, Luttrell, Mark, Shah, Manish K., Chen, Edison, Liang, Kaizhao, Jain, Swayambhoo, Thakker, Urmish, Huang, Dawei, Jairath, Sumti, Brown, Kevin J., Olukotun, Kunle
Monolithic large language models (LLMs) like GPT-4 have paved the way for modern generative AI applications. Training, serving, and maintaining monolithic LLMs at scale, however, remains prohibitively expensive and challenging. The disproportionate i
External link:
http://arxiv.org/abs/2405.07518