Showing 1 - 10 of 739 for search: '"WILLIAMSON, DONALD A."'
Objective speech quality measures are typically used to assess speech enhancement algorithms, but it has been shown that they are sub-optimal as learning objectives because they do not always align well with human subjective ratings. This misalignmen
External link:
http://arxiv.org/abs/2410.13182
Author:
Kibria, Imran E, Williamson, Donald S.
Speech quality is best evaluated by human feedback using mean opinion scores (MOS). However, variance in ratings between listeners can introduce noise in the true quality label of an utterance. Currently, deep learning networks including convolutiona
External link:
http://arxiv.org/abs/2410.12675
Author:
Pias, Sabid Bin Habib, Freel, Alicia, Huang, Ran, Williamson, Donald, Kim, Minjeong, Kapadia, Apu
Voice Assistants (VAs) are popular for simple tasks, but users are often hesitant to use them for complex activities like online shopping. We explored whether vocal characteristics, like the VA's vocal tone, can make VAs perceived as more attracti
External link:
http://arxiv.org/abs/2409.18941
Voice Assistants (VAs) can assist users in various everyday tasks, but many users are reluctant to rely on VAs for intricate tasks like online shopping. This study aims to examine whether the vocal characteristics of VAs can serve as an effective too
External link:
http://arxiv.org/abs/2405.04791
Author:
Pias, Sabid Bin Habib, Freel, Alicia, Trammel, Timothy, Akter, Taslima, Williamson, Donald, Kapadia, Apu
With the emergence of Artificial Intelligence (AI)-based decision-making, explanations help increase new technology adoption through enhanced trust and reliability. However, our experimental study challenges the notion that every user universally val
External link:
http://arxiv.org/abs/2404.19629
Perceptual evaluation constitutes a crucial aspect of various audio-processing tasks. Full reference (FR) or similarity-based metrics rely on high-quality reference recordings, to which lower-quality or corrupted versions of the recording may be comp
External link:
http://arxiv.org/abs/2310.09388
In contemporary society, voice-controlled devices, such as smartphones and home assistants, have become pervasive due to their advanced capabilities and functionality. The always-on nature of their microphones offers users the convenience of readily
External link:
http://arxiv.org/abs/2309.15087
Author:
Liu, Yuchen, Ong, Natasha, Peng, Kaiyan, Xiong, Bo, Wang, Qifan, Hou, Rui, Khabsa, Madian, Yang, Kaiyue, Liu, David, Williamson, Donald S., Yu, Hanchao
We present Multiscale Multiview Vision Transformers (MMViT), which introduces multiscale feature maps and multiview encodings to transformer models. Our model encodes different views of the input signal and builds several channel-resolution feature s
External link:
http://arxiv.org/abs/2305.00104
Perceptually-inspired objective functions such as the perceptual evaluation of speech quality (PESQ), signal-to-distortion ratio (SDR), and short-time objective intelligibility (STOI), have recently been used to optimize performance of deep-learning-
External link:
http://arxiv.org/abs/2303.13685
Dereverberation is often performed directly on the reverberant audio signal, without knowledge of the acoustic environment. Reverberation time, T60, however, is an essential acoustic factor that reflects how reverberation may impact a signal. In this
External link:
http://arxiv.org/abs/2302.04932