A review into deep learning techniques for spoken language identification.

Author:	Thukroo, Irshad Ahmad; Bashir, Rumaan; Giri, Kaiser J.
Subject:	DEEP learning; ORAL communication; NATURAL language processing; ARTIFICIAL neural networks; ARTIFICIAL intelligence; PROGRAMMING languages; SPEECH perception
Source:	Multimedia Tools & Applications; Sep 2022, Vol. 81, Issue 22, pp. 32593-32624, 32 p.
Abstract:	Over the past couple of decades, information technology has reached new heights, mostly to simplify day-to-day human life. One of its key contributions is the application of artificial intelligence to achieve better results. The advent of artificial intelligence has given rise to a new branch of Natural Language Processing (NLP) called Computational Linguistics, which generates frameworks for intelligently manipulating spoken language knowledge and has brought human-machine interaction to a new stage. In this context, speech, which is our basic and generally most preferred mode of communication, has emerged as one of the most important forms of interface. Language identification, being the front end for various natural language processing tasks, plays an important role in language translation. Consequently, attention has focused on the area of speech recognition concerned with the identification and recognition of languages by a machine. Spoken language identification is the identification of the language present in a speech segment regardless of its size (duration and speed), ambiance (topic and emotion), and speaker (gender, age, demographic region). This paper investigates existing spoken language identification models implemented using different deep learning approaches, together with the datasets and performance measures used to analyze them. It also highlights the main features of these models and the challenges they face. A comprehensive comparative study of deep learning techniques for spoken language identification is carried out. Moreover, this review analyzes the efficiency of spoken language identification models, which can help researchers propose new language identification models for speech signals. [ABSTRACT FROM AUTHOR]
Database:	Complementary Index