ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets
| Author: | Shi, Jiatong; Wang, Shih-Heng; Chen, William; Bartelds, Martijn; Kumar, Vanya Bannihatti; Tian, Jinchuan; Chang, Xuankai; Jurafsky, Dan; Livescu, Karen; Lee, Hung-yi; Watanabe, Shinji |
|---|---|
| Publication year: | 2024 |
| Document type: | Working Paper |
| Description: | ML-SUPERB evaluates self-supervised learning (SSL) models on the tasks of language identification and automatic speech recognition (ASR). The benchmark treats the models as feature extractors and uses a single shallow downstream model, which can be fine-tuned for each downstream task. However, real-world use cases may require different configurations. This paper presents ML-SUPERB 2.0, a new benchmark for evaluating pre-trained SSL and supervised speech models across downstream models, fine-tuning setups, and efficient model adaptation approaches. We find performance improvements over the ML-SUPERB setup, but performance depends on the downstream model design. We also find large performance differences between languages and datasets, suggesting the need for more targeted approaches to improve multilingual ASR performance. Comment: Accepted by Interspeech 2024 |
| Database: | arXiv |
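
For illustration only (not code from the paper or the official benchmark): a minimal PyTorch sketch of the setup described in the abstract, where a pre-trained SSL model is kept frozen as a feature extractor and only a single shallow downstream model is trained, here on a toy CTC-based ASR objective. `ToySSLEncoder`, `ShallowDownstream`, and all dimensions are hypothetical stand-ins for a real pre-trained encoder and the benchmark's actual downstream architectures.

```python
# Hypothetical sketch of the "frozen SSL feature extractor + shallow downstream model"
# setup; names and dimensions are illustrative, not the benchmark's implementation.
import torch
import torch.nn as nn

class ToySSLEncoder(nn.Module):
    """Stand-in for a pre-trained SSL speech encoder (hypothetical)."""
    def __init__(self, feat_dim=768):
        super().__init__()
        self.conv = nn.Conv1d(1, feat_dim, kernel_size=400, stride=320)

    def forward(self, wav):                  # wav: (batch, samples)
        x = self.conv(wav.unsqueeze(1))      # (batch, feat_dim, frames)
        return x.transpose(1, 2)             # (batch, frames, feat_dim)

class ShallowDownstream(nn.Module):
    """Single shallow downstream model on top of frozen features (here: ASR head)."""
    def __init__(self, feat_dim=768, vocab_size=100):
        super().__init__()
        self.proj = nn.Linear(feat_dim, 256)
        self.out = nn.Linear(256, vocab_size)

    def forward(self, feats):
        return self.out(torch.relu(self.proj(feats)))  # (batch, frames, vocab)

encoder = ToySSLEncoder()
for p in encoder.parameters():               # freeze: encoder acts as feature extractor
    p.requires_grad = False

downstream = ShallowDownstream()
optimizer = torch.optim.Adam(downstream.parameters(), lr=1e-3)
ctc_loss = nn.CTCLoss(blank=0)

wav = torch.randn(2, 16000)                  # dummy 1-second batch at 16 kHz
targets = torch.randint(1, 100, (2, 10))     # dummy token targets (no blanks)
with torch.no_grad():
    feats = encoder(wav)                     # frozen SSL features
log_probs = downstream(feats).log_softmax(-1).transpose(0, 1)  # (frames, batch, vocab)
input_lens = torch.full((2,), log_probs.size(0), dtype=torch.long)
target_lens = torch.full((2,), targets.size(1), dtype=torch.long)
loss = ctc_loss(log_probs, targets, input_lens, target_lens)
loss.backward()                              # only the shallow downstream model is updated
optimizer.step()
```

ML-SUPERB 2.0, as described in the abstract, relaxes this single configuration by also varying the downstream model design, fine-tuning setups, and efficient adaptation approaches.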