Showing 1 - 10 of 29,706 results for search: '"Speech understanding"'
The success of large language models (LLMs) has prompted efforts to integrate speech and audio data, aiming to create general foundation models capable of processing both textual and non-textual inputs. Recent advances, such as GPT-4o, highlight the
External link:
http://arxiv.org/abs/2410.13268
Automatic Speech Understanding (ASU) aims at human-like speech interpretation, providing nuanced intent, emotion, sentiment, and content understanding from speech and language (text) content conveyed in speech. Typically, training a robust ASU model
External link:
http://arxiv.org/abs/2404.17983
Author:
Cai, Dongqi
Speech is a common input method for mobile embedded devices, but cloud-based speech recognition systems pose privacy risks. Disentanglement-based encoders, designed to safeguard user privacy by filtering sensitive information from speech signals, unf
External link:
http://arxiv.org/abs/2401.11983
Author:
Wang, Rongxiang, Lin, Felix Xiaozhu
Modern speech understanding (SU) runs a sophisticated pipeline: ingesting streaming voice input, the pipeline executes encoder-decoder based deep neural networks repeatedly; by doing so, the pipeline generates tentative outputs (called hypotheses), a
External link:
http://arxiv.org/abs/2311.17065
Author:
Patel, Mitsoo K.
Published in:
University of Chicago Law Review. 2024Special, p567-603. 37p.
This paper addresses spoken language understanding (SLU) on microcontroller-like embedded devices, integrating on-device execution with cloud offloading in a novel fashion. We leverage temporal locality in the speech inputs to a device and reuse rece
External link:
http://arxiv.org/abs/2311.18188
Author:
Czurda, Ronja (ronja.czurda@uniklinik-freiburg.de), Wesarg, Thomas, Aschendorff, Antje (antje.aschendorff@uniklinik-freiburg.de), Beck, Rainer Linus (rainer.beck@uniklinik-freiburg.de), Hocke, Thomas (thocke@cochlear.com), Ketterer, Manuel Christoph (manuel.christoph.ketterer@uniklinik-freiburg.de), Arndt, Susan (susan.arndt@uniklinik-freiburg.de)
Published in:
Journal of Clinical Medicine. Feb2024, Vol. 13 Issue 3, p646. 12p.
Author:
Dennison, Stephen R. (srdennison@wisc.edu), Thakkar, Tanvi (tthakkar@uwlax.edu), Kan, Alan (alan.kan@mq.edu.au), Svirsky, Mario A. (mario.svirsky@nyulangone.org), Azadpour, Mahan (mahan.azadpour@nyulangone.org), Litovsky, Ruth Y. (ruth.litovsky@wisc.edu)
Published in:
Journal of Clinical Medicine. Apr2024, Vol. 13 Issue 7, p1917. 17p.
Author:
Tran, Minh, Soleymani, Mohammad
Existing privacy-preserving speech representation learning methods target a single application domain. In this paper, we present a novel framework to anonymize utterance-level speech embeddings generated by pre-trained encoders and show its effective
External link:
http://arxiv.org/abs/2310.17194
Humans are surrounded by audio signals that include both speech and non-speech sounds. The recognition and understanding of speech and non-speech audio events, along with a profound comprehension of the relationship between them, constitute fundament
External link:
http://arxiv.org/abs/2309.14405