Search results

Report

Phonetic Error Analysis of Raw Waveform Acoustic Models with Parametric and Non-Parametric CNNs

Authors: Loweimi, Erfan, Carmantini, Andrea, Bell, Peter, Renals, Steve, Cvetkovic, Zoran

In this paper, we analyse the error patterns of the raw waveform acoustic models in TIMIT's phone recognition task. Our analysis goes beyond the conventional phone error rate (PER) metric. We categorise the phones into three groups: {affricate, dipht

External link: http://arxiv.org/abs/2406.00898

Report

Towards Robust Waveform-Based Acoustic Models

Authors: Oglic, Dino, Cvetkovic, Zoran, Sollich, Peter, Renals, Steve, Yu, Bin

Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2022

We study the problem of learning robust acoustic models in adverse environments, characterized by a significant mismatch between training and test conditions. This problem is of paramount importance for the deployment of speech recognition systems th

External link: http://arxiv.org/abs/2110.08634

Report

Automatic audiovisual synchronisation for ultrasound tongue imaging

Authors: Eshky, Aciel, Cleland, Joanne, Ribeiro, Manuel Sam, Sugden, Eleanor, Richmond, Korin, Renals, Steve

Ultrasound tongue imaging is used to visualise the intra-oral articulators during speech production. It is utilised in a range of applications, including speech and language therapy and phonetics research. Ultrasound and speech audio are recorded sim

External link: http://arxiv.org/abs/2105.15162

Report

Exploiting ultrasound tongue imaging for the automatic detection of speech articulation errors

Authors: Ribeiro, Manuel Sam, Cleland, Joanne, Eshky, Aciel, Richmond, Korin, Renals, Steve

Published in: Speech Communication, Volume 128, April 2021, Pages 24-34

Speech sound disorders are a common communication impairment in childhood. Because speech disorders can negatively affect the lives and the development of children, clinical intervention is often recommended. To help with diagnosis and treatment, cli

External link: http://arxiv.org/abs/2103.00324

Report

Silent versus modal multi-speaker speech recognition from ultrasound and video

Authors: Ribeiro, Manuel Sam, Eshky, Aciel, Richmond, Korin, Renals, Steve

We investigate multi-speaker speech recognition from ultrasound images of the tongue and video images of the lips. We train our systems on imaging data from modal speech, and evaluate on matched test sets of two speaking modes: silent and modal speec

External link: http://arxiv.org/abs/2103.00333

Report

Train your classifier first: Cascade Neural Networks Training from upper layers to lower layers

Authors: Zhang, Shucong, Do, Cong-Thanh, Doddipatla, Rama, Loweimi, Erfan, Bell, Peter, Renals, Steve

Although the lower layers of a deep neural network learn features which are transferable across datasets, these layers are not transferable within the same dataset. That is, in general, freezing the trained feature extractor (the lower layers) and re

External link: http://arxiv.org/abs/2102.04697

Academic article

Risk of vaccine preventable diseases in UK migrants: A serosurvey and concordance analysis

Authors: Mayuri Gogoi, Christopher A. Martin, Paul W. Bird, Martin J. Wiselka, Judi Gardener, Kate Ellis, Valerie Renals, Adam J. Lewszuk, Sally Hargreaves, Manish Pareek

Published in: Journal of Migration and Health, Vol 9, Iss , Pp 100217- (2024)

Background: Vaccine preventable diseases (VPDs) such as measles and rubella cause significant morbidity and mortality globally every year. The World Health Organization (WHO), reported vaccine coverage for both measles and rubella to be 71 % in 2019,

External link: https://doaj.org/article/a2d3d709423d4c65b44b3ba999588595

Report

TaL: a synchronised multi-speaker corpus of ultrasound tongue imaging, audio, and lip videos

Authors: Ribeiro, Manuel Sam, Sanger, Jennifer, Zhang, Jing-Xuan, Eshky, Aciel, Wrench, Alan, Richmond, Korin, Renals, Steve

We present the Tongue and Lips corpus (TaL), a multi-speaker corpus of audio, ultrasound tongue imaging, and lip videos. TaL consists of two parts: TaL1 is a set of six recording sessions of one professional voice talent, a male native speaker of Eng

External link: http://arxiv.org/abs/2011.09804

Report

On the Usefulness of Self-Attention for Automatic Speech Recognition with Transformers

Authors: Zhang, Shucong, Loweimi, Erfan, Bell, Peter, Renals, Steve

Self-attention models such as Transformers, which can capture temporal relationships without being limited by the distance between events, have given competitive speech recognition results. However, we note the range of the learned context increases

External link: http://arxiv.org/abs/2011.04906
