Showing 1 - 10 of 4,402 results for search: '"Speech transcription"'
Author:
Lian, Jiachen, Feng, Carly, Farooqi, Naasir, Li, Steve, Kashyap, Anshul, Cho, Cheol Jun, Wu, Peter, Netzorg, Robbie, Li, Tingle, Anumanchipalli, Gopala Krishna
Dysfluent speech modeling requires time-accurate and silence-aware transcription at both the word-level and phonetic-level. However, current research in dysfluency modeling primarily focuses on either transcription or detection, and the performance o
External link:
http://arxiv.org/abs/2312.12810
Large-scale, weakly-supervised speech recognition models, such as Whisper, have demonstrated impressive results on speech recognition across domains and languages. However, their application to long audio transcription via buffered or sliding window
External link:
http://arxiv.org/abs/2303.00747
Self-training has been shown to be helpful in addressing data scarcity for many domains, including vision, speech, and language. Specifically, self-training, or pseudo-labeling, labels unsupervised data and adds that to the training pool. In this wor
External link:
http://arxiv.org/abs/2212.09982
Masked language models have revolutionized natural language processing systems in the past few years. A recently introduced generalization of masked language models called warped language models are trained to be more robust to the types of errors th
External link:
http://arxiv.org/abs/2103.14580
New advances in machine learning have made Automated Speech Recognition (ASR) systems practical and more scalable. These systems, however, pose serious privacy threats as speech is a rich source of sensitive acoustic and textual information. Although
External link:
http://arxiv.org/abs/1909.04198
This paper explores contexts associated with errors in transcription of spontaneous speech, shedding light on human perception of disfluencies and other conversational speech phenomena. A new version of the Switchboard corpus is provided with disfluen
External link:
http://arxiv.org/abs/1904.04398
Author:
Anastasopoulos, Antonis, Chiang, David
Recently proposed data collection frameworks for endangered language documentation aim not only to collect speech in the language of interest, but also to collect translations into a high-resource language that will render the collected resource inte
External link:
http://arxiv.org/abs/1803.08991