Showing 1 - 10 of 45
for search: '"Erik Marchi"'
Published in:
IEEE Access, Vol. 4, pp. 6059-6072 (2016)
Practically no knowledge exists on the effects of speech coding and recognition for narrow-band transmission of speech signals within certain frequency ranges, especially in relation to the recognition of paralinguistic cues in speech. We thus invest…
External link:
https://doaj.org/article/2e85ee88be3c402c8b0e2847b00db947
Author:
Prateeth Nayak, Takuya Higuchi, Anmol Gupta, Shivesh Ranjan, Stephen Shum, Siddharth Sigtia, Erik Marchi, Varun Lakshminarasimhan, Minsik Cho, Saurabh Adya, Chandra Dhir, Ahmed Tewfik
Voice trigger detection is an important task, which enables activating a voice assistant when a target user speaks a keyword phrase. A detector is typically trained on speech data independent of speaker information and used for the voice trigger dete…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::7d2247120c2b7942ea99df1878f165ef
Published in:
Lecture Notes in Electrical Engineering, ISBN: 9789811593222
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::8daedc96270dcb0b92391f25d4cbdf72
https://doi.org/10.1007/978-981-15-9323-9
Author:
Tuomo Raitio, Tobias Bleisch, Petko N. Petkov, Qiong Hu, Varun Lakshminarasimhan, Erik Marchi
Published in:
SLT
It is desirable for a text-to-speech system to take into account the environment where synthetic speech is presented, and provide appropriate context-dependent output to the user. In this paper, we present and compare various approaches for generatin…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::77e855d6f7336fc20fefeacc7f55259f
Published in:
ICASSP
We present an architecture for voice trigger detection for virtual assistants. The main idea in this work is to exploit information in words that immediately follow the trigger phrase. We first demonstrate that by including more audio context after a…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::7fa6202fbfe5a2ec003a8afb8cb11f2d
http://arxiv.org/abs/2010.15446
Published in:
Scopus-Elsevier
INTERSPEECH
The automatic recognition of facial behaviours is usually achieved through the detection of particular FACS Action Units (AUs), which then makes it possible to analyse the affective behaviours expressed in the face. Despite the fact that advanced techn…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::44eebc10c8185a9d33554cd369d0c2e5
https://opus.bibliothek.uni-augsburg.de/opus4/frontdoor/index/index/docId/77171
Published in:
ICASSP
We present progress towards bilingual Text-to-Speech which is able to transform a monolingual voice to speak a second language while preserving speaker voice quality. We demonstrate that a bilingual speaker embedding space contains a separate distrib…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::698c74e9e8a040e13a94e5597f52a908
http://arxiv.org/abs/2004.04972
Highly spontaneous, conversational, and potentially emotional and noisy speech is known to be a challenge for today's automatic speech recognition (ASR) systems, which highlights the need for advanced algorithms that improve speech features and mod…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::56499c4995c9664c6b0644aeb0bc330e
https://opus.bibliothek.uni-augsburg.de/opus4/frontdoor/index/index/docId/73303
Published in:
SocialCom/PASSAT
The availability of speech corpora is positively correlated with typicality: The more typical the population is we draw our sample from, the easier it is to get enough data. The less typical the envisaged population is, the more difficult it is to ge…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::d11ccadbd84cd35e55fe1aba96b61aa1
https://opus.bibliothek.uni-augsburg.de/opus4/frontdoor/index/index/docId/67975
Author:
Zakaria Aldeneh, Ahmed Hussen Abdelaziz, Devang Naik, Erik Marchi, Barry-John Theobald, Anushree Prasanna Kumar, Sachin S. Kajarekar
Published in:
ICASSP
We present an introspection of an audiovisual speech enhancement model. In particular, we focus on interpreting how a neural audiovisual speech enhancement model uses visual cues to improve the quality of the target speech signal. We show that visual…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::363bfbf343bf136a979b620c28c30afb