Search results - "Loweimi, Erfan"

Report

Phonetic Error Analysis of Raw Waveform Acoustic Models with Parametric and Non-Parametric CNNs

Authors: Loweimi, Erfan; Carmantini, Andrea; Bell, Peter; Renals, Steve; Cvetkovic, Zoran

In this paper, we analyse the error patterns of the raw waveform acoustic models in TIMIT's phone recognition task. Our analysis goes beyond the conventional phone error rate (PER) metric. We categorise the phones into three groups: {affricate, dipht

External link: http://arxiv.org/abs/2406.00898

Report

Zero-shot Audio Topic Reranking using Large Language Models

Authors: Qian, Mengjie; Ma, Rao; Liusie, Adian; Loweimi, Erfan; Knill, Kate M.; Gales, Mark J. F.

Multimodal Video Search by Examples (MVSE) investigates using video clips as the query term for information retrieval, rather than the more traditional text query. This enables far richer search modalities such as images, speaker, content, topic, and

External link: http://arxiv.org/abs/2309.07606

Report

RCT: Random Consistency Training for Semi-supervised Sound Event Detection

Authors: Shao, Nian; Loweimi, Erfan; Li, Xiaofei

Sound event detection (SED), as a core module of acoustic environmental analysis, suffers from the problem of data deficiency. The integration of semi-supervised learning (SSL) largely mitigates such problem while bringing no extra annotation budget.

External link: http://arxiv.org/abs/2110.11144

Report

Train your classifier first: Cascade Neural Networks Training from upper layers to lower layers

Authors: Zhang, Shucong; Do, Cong-Thanh; Doddipatla, Rama; Loweimi, Erfan; Bell, Peter; Renals, Steve

Although the lower layers of a deep neural network learn features which are transferable across datasets, these layers are not transferable within the same dataset. That is, in general, freezing the trained feature extractor (the lower layers) and re

External link: http://arxiv.org/abs/2102.04697

Report

On the Usefulness of Self-Attention for Automatic Speech Recognition with Transformers

Authors: Zhang, Shucong; Loweimi, Erfan; Bell, Peter; Renals, Steve

Self-attention models such as Transformers, which can capture temporal relationships without being limited by the distance between events, have given competitive speech recognition results. However, we note the range of the learned context increases

External link: http://arxiv.org/abs/2011.04906

Report

Stochastic Attention Head Removal: A simple and effective method for improving Transformer Based ASR Models

Authors: Zhang, Shucong; Loweimi, Erfan; Bell, Peter; Renals, Steve

Recently, Transformer based models have shown competitive automatic speech recognition (ASR) performance. One key factor in the success of these models is the multi-head attention mechanism. However, for trained models, we have previously observed th

External link: http://arxiv.org/abs/2011.04004

Report

When Can Self-Attention Be Replaced by Feed Forward Layers?

Authors: Zhang, Shucong; Loweimi, Erfan; Bell, Peter; Renals, Steve

Recently, self-attention models such as Transformers have given competitive results compared to recurrent neural network systems in speech recognition. The key factor for the outstanding performance of self-attention models is their ability to captur

External link: http://arxiv.org/abs/2005.13895

Dissertation/Thesis

Robust phase-based speech signal processing from source-filter separation to model-based robust ASR

Author: Loweimi, Erfan

The Fourier analysis plays a key role in speech signal processing. As a complex quantity, it can be expressed in the polar form using the magnitude and phase spectra. The magnitude spectrum is widely used in almost every corner of speech processing.

External link: https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.736567

Report

Acoustic Model Adaptation from Raw Waveforms with SincNet

Authors: Fainberg, Joachim; Klejch, Ondřej; Loweimi, Erfan; Bell, Peter; Renals, Steve

Raw waveform acoustic modelling has recently gained interest due to neural networks' ability to learn feature extraction, and the potential for finding better representations for a given scenario than hand-crafted features. SincNet has been proposed

External link: http://arxiv.org/abs/1909.13759
