ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets

Author: Shi, Jiatong; Wang, Shih-Heng; Chen, William; Bartelds, Martijn; Kumar, Vanya Bannihatti; Tian, Jinchuan; Chang, Xuankai; Jurafsky, Dan; Livescu, Karen; Lee, Hung-yi; Watanabe, Shinji
Year of publication: 2024
Subject:
Document type: Working Paper
Description: ML-SUPERB evaluates self-supervised learning (SSL) models on the tasks of language identification (LID) and automatic speech recognition (ASR). This benchmark treats the models as frozen feature extractors and uses a single shallow downstream model, which can be fine-tuned for a downstream task. However, real-world use cases may require different configurations. This paper presents ML-SUPERB 2.0, a new benchmark for evaluating pre-trained SSL and supervised speech models across downstream model architectures, fine-tuning setups, and efficient model adaptation approaches. We find performance improvements over the original ML-SUPERB setup, but performance depends on the downstream model design. We also find large performance differences between languages and datasets, suggesting the need for more targeted approaches to improving multilingual ASR performance.
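To illustrate the baseline setup the description refers to, the following is a minimal sketch (not the official ML-SUPERB code) of a frozen SSL encoder used as a feature extractor with a single shallow, trainable downstream head for LID. The specific checkpoint name, mean-pooling choice, and linear head are illustrative assumptions.

```python
# Minimal sketch of the "SSL model as feature extractor + shallow downstream
# model" setup described above. Not the official benchmark implementation;
# the checkpoint, pooling, and head design here are illustrative assumptions.
import torch
import torch.nn as nn
from transformers import Wav2Vec2Model

class ShallowDownstream(nn.Module):
    """Frozen SSL encoder with a small trainable head for language identification."""

    def __init__(self, num_languages: int,
                 ssl_name: str = "facebook/wav2vec2-xls-r-300m"):
        super().__init__()
        self.encoder = Wav2Vec2Model.from_pretrained(ssl_name)
        for p in self.encoder.parameters():
            p.requires_grad = False  # SSL model stays frozen; only the head trains
        hidden = self.encoder.config.hidden_size
        self.head = nn.Linear(hidden, num_languages)  # the shallow downstream model

    def forward(self, waveform: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            feats = self.encoder(waveform).last_hidden_state  # (batch, time, hidden)
        return self.head(feats.mean(dim=1))  # mean-pool over time, then classify

# Usage: one second of 16 kHz audio -> per-language logits.
model = ShallowDownstream(num_languages=143)  # ML-SUPERB covers 143 languages
logits = model(torch.randn(1, 16000))
```

ML-SUPERB 2.0 relaxes exactly this design: it additionally varies the downstream architecture, allows (partial) fine-tuning of the SSL encoder, and evaluates parameter-efficient adaptation instead of fixing a single shallow head.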
Comment: Accepted by Interspeech 2024
Database: arXiv