Arabic Dialect Identification Using Different Machine Learning Methods

Authors: Khalid M.O. Nahar, Obaida M. Al-Hazaimeh, Ashraf Abu-Ein, Mohammed Azmi Al-Betar
Year of publication: 2022
Description: Arabic dialect identification is the task of identifying a speaker’s dialect from features of the corresponding acoustic waveform. This research proposes machine learning models for detecting Arabic dialects from acoustic waveforms, using short-duration utterances (less than 5 seconds) from the ADI17 corpus, comprising 1717 wave files with a total duration of 2 hours, 2 minutes, and 11 seconds. Mel-Frequency Cepstral Coefficients (MFCC) and Triangular Filter Bank Cepstral Coefficients (TFCC) are used for feature extraction from the input acoustic signal. The extracted features form the speaker’s feature matrix, which is used for automatic recognition with K-Nearest Neighbor (KNN), Random Forest (RF), Multi-Layer Perceptron (MLP), and Artificial Neural Network (ANN) classifiers. With MFCC features, the reported accuracy was 76% for KNN, 64% for RF, 41% for ANN, and 34% for MLP; with TFCC features, it was 62% for KNN, 60% for RF, 42% for ANN, and 33% for MLP.
Database: OpenAIRE
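
Illustration: the classification step named in the description (a feature matrix fed to a K-Nearest Neighbor classifier) can be sketched roughly as follows. This is a minimal sketch, not the authors' implementation. It assumes each utterance has already been reduced to a fixed-length feature vector (for example, cepstral coefficients averaged over frames). The function names `knn_predict` and `accuracy`, the value k=3, the Euclidean distance, the toy feature values, and the labels `dialect_A` and `dialect_B` are all hypothetical and not taken from the record.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training vectors nearest to query."""
    # Euclidean distance from the query to every training vector (assumed metric).
    neighbours = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    # Majority vote over the k closest neighbours.
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of test vectors whose predicted label matches the true label."""
    correct = sum(
        knn_predict(train_X, train_y, x, k) == y for x, y in zip(test_X, test_y)
    )
    return correct / len(test_y)


# Synthetic toy feature vectors (hypothetical, NOT real MFCC/TFCC values).
train_X = [
    [1.0, 2.0, 0.5], [1.2, 1.8, 0.4], [0.9, 2.1, 0.6],
    [4.0, 0.5, 2.0], [4.2, 0.4, 2.1], [3.9, 0.6, 1.9],
]
train_y = ["dialect_A"] * 3 + ["dialect_B"] * 3
test_X = [[1.1, 2.0, 0.5], [4.1, 0.5, 2.0]]
test_y = ["dialect_A", "dialect_B"]

if __name__ == "__main__":
    print(accuracy(train_X, train_y, test_X, test_y))
```

In practice the feature vectors would come from an audio feature extractor rather than hand-written lists, and k would be tuned on held-out data.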