Speaker ethnic identification for continuous speech in Malay language using pitch and MFCC

Authors:	Khalid Isa, Shamsul Mohamad, Rafizah Mohd Hanifa
Year of publication:	2020
Subject:	Control and Optimization; Support vector machine; Computer Networks and Communications; Computer science; Speech recognition; Feature extraction; 02 engineering and technology; Malay language; Ethnic identification; Naive Bayes classifier; MFCC; Stress (linguistics); 0202 electrical engineering, electronic engineering, information engineering; Electrical and Electronic Engineering; Malay; 020206 networking & telecommunications; language.human_language; Tree (data structure); Hardware and Architecture; Signal Processing; language; 020201 artificial intelligence & image processing; Mel-frequency cepstrum; Information Systems
Description:	Voice recognition has advanced rapidly over the years. The purpose of voice recognition, sometimes called speaker identification, is to identify the person who is speaking. This is done by extracting speech features that differ between individuals because of physiology (the shape and size of the mouth and throat) and behavioral patterns (pitch, accent and speaking style). This paper describes a voice recognition approach for identifying the ethnicity of Malaysian speakers. Pitch and 13 Mel-Frequency Cepstral Coefficients (MFCCs) are extracted from 52 recordings of continuous speech in Malay and used as features to train Tree, Naïve Bayes, Nearest Neighbors and Support Vector Machine (SVM) classifiers; another 10 recordings are used for testing. The results show that combining pitch with the 13 coefficients as features and training with SVM gives better accuracy (57.7%) than using the 13 coefficients alone (53.8%).
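The feature combination described above can be illustrated with a minimal sketch. It assumes that per-frame MFCCs and a per-frame pitch track are already available, that each recording is summarized by averaging over frames, and that unvoiced frames are marked with a pitch of 0; none of these details are stated in the record. The function names, the toy data and the "group A"/"group B" labels are hypothetical, and a plain nearest-neighbor rule stands in for the classifiers named in the abstract.

```python
import math

def mean_vector(frames):
    """Average per-frame feature vectors into one fixed-length vector."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]

def build_features(mfcc_frames, pitch_track, use_pitch=True):
    """Return 13 averaged MFCCs, optionally followed by the mean voiced pitch."""
    features = mean_vector(mfcc_frames)
    if use_pitch:
        # Assumption: a pitch value of 0 marks an unvoiced frame.
        voiced = [f0 for f0 in pitch_track if f0 > 0]
        features.append(sum(voiced) / len(voiced) if voiced else 0.0)
    return features

def nearest_neighbor(train, query):
    """Label of the training vector closest to `query` (Euclidean distance)."""
    return min(train, key=lambda item: math.dist(item[0], query))[1]

# Toy data (hypothetical): per-frame 13-MFCC vectors and a pitch track in Hz.
rec_a = ([[1.0] * 13, [3.0] * 13], [0.0, 120.0, 130.0])
rec_b = ([[1.0] * 13, [3.0] * 13], [0.0, 210.0, 230.0])
train = [(build_features(*rec_a), "group A"),
         (build_features(*rec_b), "group B")]

query = build_features([[2.0] * 13], [215.0, 0.0])
print(len(query), nearest_neighbor(train, query))  # prints "14 group B"
```

In this toy case the MFCC averages are identical, so only the appended pitch value separates the two recordings. With real data the pitch in Hz would likely dominate the distance unless the features are scaled first.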
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::c6f4db3e3f2ec41cc9d92474dbeba849 https://zenodo.org/record/5709893