Showing 1 - 10 of 45
for search: '"Erik Marchi"'
Published in:
IEEE Access, Vol. 4, pp. 6059-6072 (2016)
Practically no knowledge exists on the effects of speech coding and recognition for narrow-band transmission of speech signals within certain frequency ranges, especially in relation to the recognition of paralinguistic cues in speech. We thus invest…
External link:
https://doaj.org/article/2e85ee88be3c402c8b0e2847b00db947
Author:
Prateeth Nayak, Takuya Higuchi, Anmol Gupta, Shivesh Ranjan, Stephen Shum, Siddharth Sigtia, Erik Marchi, Varun Lakshminarasimhan, Minsik Cho, Saurabh Adya, Chandra Dhir, Ahmed Tewfik
Voice trigger detection is an important task, which enables activating a voice assistant when a target user speaks a keyword phrase. A detector is typically trained on speech data independent of speaker information and used for the voice trigger dete…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::7d2247120c2b7942ea99df1878f165ef
Published in:
Lecture Notes in Electrical Engineering, ISBN: 9789811593222
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::8daedc96270dcb0b92391f25d4cbdf72
https://doi.org/10.1007/978-981-15-9323-9
Author:
Tuomo Raitio, Tobias Bleisch, Petko N. Petkov, Qiong Hu, Varun Lakshminarasimhan, Erik Marchi
Published in:
SLT
It is desirable for a text-to-speech system to take into account the environment where synthetic speech is presented, and provide appropriate context-dependent output to the user. In this paper, we present and compare various approaches for generatin…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::77e855d6f7336fc20fefeacc7f55259f
Published in:
ICASSP
We present an architecture for voice trigger detection for virtual assistants. The main idea in this work is to exploit information in words that immediately follow the trigger phrase. We first demonstrate that by including more audio context after a…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::7fa6202fbfe5a2ec003a8afb8cb11f2d
http://arxiv.org/abs/2010.15446
Published in:
Scopus-Elsevier
INTERSPEECH
The automatic recognition of facial behaviours is usually achieved through the detection of particular FACS Action Units (AUs), which then makes it possible to analyse the affective behaviours expressed in the face. Despite the fact that advanced techn…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::44eebc10c8185a9d33554cd369d0c2e5
https://opus.bibliothek.uni-augsburg.de/opus4/frontdoor/index/index/docId/77171
Published in:
ICASSP
We present progress towards bilingual Text-to-Speech which is able to transform a monolingual voice to speak a second language while preserving speaker voice quality. We demonstrate that a bilingual speaker embedding space contains a separate distrib…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::698c74e9e8a040e13a94e5597f52a908
http://arxiv.org/abs/2004.04972
Highly spontaneous, conversational, and potentially emotional and noisy speech is known to be a challenge for today's automatic speech recognition (ASR) systems, which highlights the need for advanced algorithms that improve speech features and mod…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::56499c4995c9664c6b0644aeb0bc330e
https://opus.bibliothek.uni-augsburg.de/opus4/frontdoor/index/index/docId/73303
Published in:
SocialCom/PASSAT
The availability of speech corpora is positively correlated with typicality: The more typical the population is we draw our sample from, the easier it is to get enough data. The less typical the envisaged population is, the more difficult it is to ge…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::d11ccadbd84cd35e55fe1aba96b61aa1
https://opus.bibliothek.uni-augsburg.de/opus4/frontdoor/index/index/docId/67975
Author:
Zakaria Aldeneh, Ahmed Hussen Abdelaziz, Devang Naik, Erik Marchi, Barry-John Theobald, Anushree Prasanna Kumar, Sachin S. Kajarekar
Published in:
ICASSP
We present an introspection of an audiovisual speech enhancement model. In particular, we focus on interpreting how a neural audiovisual speech enhancement model uses visual cues to improve the quality of the target speech signal. We show that visual…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::363bfbf343bf136a979b620c28c30afb