A Kernel Wrapper for Phoneme Sequence Recognition

Author:	Dan Chazan, Joseph Keshet
Contributors:	Keshet, Joseph; Bengio, Samy
Year of publication:	2009
Subject:	business.industry; Computer science; Speech recognition; Acoustic model; Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing); Pattern recognition; TIMIT; Levenshtein distance; ComputingMethodologies_ARTIFICIALINTELLIGENCE; Support vector machine; symbols.namesake; ComputingMethodologies_PATTERNRECOGNITION; Computer Science::Sound; Kernel (statistics); Gaussian function; symbols; Edit distance; Artificial intelligence; business; Classifier (UML); ComputingMethodologies_COMPUTERGRAPHICS
Source:	Automatic Speech and Speaker Recognition: Large Margin and Kernel Methods
DOI:	10.1002/9780470742044.ch5
Description:	We describe a kernel wrapper: a Mercer kernel for the task of phoneme sequence recognition that is built from operations on the Gaussian kernel and is suitable for any sequence kernel classifier. We start by presenting a kernel-based algorithm for phoneme sequence recognition, which aims to minimize the Levenshtein distance (edit distance) between the predicted phoneme sequence and the true phoneme sequence. Motivated by the good results of frame-based phoneme classification using SVMs with a Gaussian kernel, we devised a kernel for speech utterances and phoneme sequences. It generalizes the kernel function used for frame-based phoneme classification and adds timing constraints in the form of transition and duration constraints. The kernel function has three parts, corresponding to a phoneme acoustic model, a phoneme duration model and a phoneme transition model. We present encouraging initial experimental results on the TIMIT corpus.
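Illustration:	The two standard building blocks named in the description can be sketched as follows. This is a minimal sketch, not the chapter's method: the function names, the phoneme labels and the bandwidth parameter sigma are assumptions made for illustration, and the chapter's three-part kernel (acoustic, duration and transition models) is not reproduced here.

```python
import math


def levenshtein(predicted, reference):
    """Edit distance between two phoneme sequences (lists of labels).

    Counts the minimum number of insertions, deletions and
    substitutions needed to turn `predicted` into `reference`.
    """
    # prev[j] holds the distance between the processed prefix of
    # `predicted` and the first j symbols of `reference`.
    prev = list(range(len(reference) + 1))
    for i, p in enumerate(predicted, start=1):
        curr = [i]
        for j, r in enumerate(reference, start=1):
            cost = 0 if p == r else 1
            curr.append(min(prev[j] + 1,          # delete p
                            curr[j - 1] + 1,      # insert r
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel exp(-||x - y||^2 / (2 sigma^2)) on two frames.

    `sigma` is a hypothetical bandwidth; the source does not give one.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


# Hypothetical phoneme sequences, for illustration only.
predicted = ["sil", "k", "ae", "t"]
reference = ["sil", "k", "aa", "t", "sil"]
distance = levenshtein(predicted, reference)
```

Here the predicted sequence differs from the reference by one substitution and one insertion, so the distance is 2. A kernel-based recognizer of the kind described would aim to keep this quantity small on held-out utterances.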
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::bb2b582ea912618cd7e0a93ec5c4e9ff https://doi.org/10.1002/9780470742044.ch5