Empirical analysis of linguistic and paralinguistic information for automatic dialect classification
Authors: | S. S. Agrawal, Aruna Jain, Shweta Sinha |
---|---|
Year of publication: | 2017 |
Subject: |
Linguistics and Language
Computer science; Speech recognition; First language; Engineering and technology; Paralanguage; Language and Linguistics; Rule-based machine translation; Artificial intelligence; Information systems; Vowel; Perception; Electrical engineering, electronic engineering, information engineering; Linguistics; Focus (linguistics); Identification (information); Formant; Artificial intelligence & image processing; Natural language processing |
Source: | Artificial Intelligence Review. 51:647-672 |
ISSN: | 1573-7462 0269-2821 |
DOI: | 10.1007/s10462-017-9573-3 |
Description: | Current research in automatic speech recognition is primarily concerned with the correct evaluation of the linguistic information transmitted in the speech signal and with the identification of the variations naturally present in speech. These variations may be due to the speaker’s age, gender, or a speaking style influenced by dialect. The focus of research in this field is to further strengthen the reliability and accuracy of the techniques developed thus far. This paper concentrates on the analysis and modelling of linguistic and paralinguistic information embedded in the speech signal, in order to discover the similarities and dissimilarities among acoustic characteristics arising from different dialects. It investigates the influence of dialectal variation by measuring and analysing acoustic features of vowel sounds: formant frequencies, pitch, pitch slope, duration and intensity. These differences are then exploited for automatic identification of the native dialect, given a sample of a native speaker’s speech. To classify the dialect of spoken utterances, support vector machines were used along with dialect-specific Gaussian mixture models. The system’s performance is compared with human perception of dialects. The study focuses on various dialects of one of the world’s major languages, Hindi. |
Database: | OpenAIRE |