Exploiting speech knowledge in neural nets for recognition

Author:	M. Huckvale
Year of publication:	1990
Subject:	Linguistics and Language; Artificial neural network; Computer science; Intelligent character recognition; business.industry; Communication; Speech recognition; Lexicon; Speaker recognition; computer.software_genre; Language and Linguistics; Computer Science Applications; ComputingMethodologies_PATTERNRECOGNITION; Modeling and Simulation; Component (UML); Pattern recognition (psychology); Feature (machine learning); Speech analytics; Computer Vision and Pattern Recognition; Artificial intelligence; business; computer; Software; Natural language processing
Source:	Speech Communication. 9:1-13
ISSN:	0167-6393
Description:	This paper argues that neural networks are good vehicles for automatic speech recognition not simply because they provide non-linear pattern recognition but because their architecture allows the incorporation and exploitation of existing knowledge about speech. The paper is in two parts: Part I defends the need for the incorporation of existing knowledge, while Part II sketches a speech recognition architecture that uses neural networks to represent and exploit existing phonological and linguistic knowledge. The first part of the paper argues that the definition of the speech recognition problem implies that prior knowledge of linguistic analysis is essential for its solution, and suggests that the currently poor exploitation of such knowledge is a consequence of contemporary pattern recognition architectures. Criticism is made of the current emphasis on syntactic pattern recognition algorithms operating at the level of the phonetic segment. The second part of the paper demonstrates that a network architecture for the lexicon provides a mechanism for the incorporation and exploitation of a range of phonological analyses. Furthermore, through the explicit separation of phonological representations from phonetic ones, there exists the possibility of constructing a front-end phonetic component on purely pattern recognition principles. Through normalisation of speaker and environment, the phonetic component may be interfaced to the network lexicon to provide a complete recognition architecture which avoids compromise in the exploitation of speech knowledge.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::862e0932ab139518e0592a920b7129ad https://doi.org/10.1016/0167-6393(90)90040-g