Showing 1 - 10 of 12 for search: '"Matthew Snover"'
Recent advances in unsupervised representation learning have demonstrated the impact of pretraining on large amounts of read speech. We adapt these techniques for domain adaptation in low-resource -- both in terms of data and compute -- conversationa
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::f9f33d8a245d6b721f4c8b8a83d731e8
http://arxiv.org/abs/2110.15836
Published in:
ICASSP
Automatic speech recognition (ASR) systems are highly sensitive to train-test domain mismatch. However, because transcription is often prohibitively expensive, it is important to be able to make use of available transcribed out-of-domain data. We add
Published in:
ICASSP
We present an importance sampling based approach to the active learning problem of selecting additional training data to supplement a seed model. Our proposed Δ-AUC selection optimizes AUC improvement in keyword search and is evaluated on the Spanis
Published in:
Machine Translation. 23:169-179
Recent efforts to develop new machine translation evaluation methods have tried to account for allowable wording differences either in terms of syntactic structure or synonyms/paraphrases. This paper primarily considers syntactic structure, combining
Published in:
Machine Translation. 23:117-127
This paper describes a new evaluation metric, TER-Plus (TERp) for automatic evaluation of machine translation (MT). TERp is an extension of Translation Edit Rate (TER). It builds on the success of TER as an evaluation metric and alignment tool and ad
Published in:
WMT@EACL
Automatic Machine Translation (MT) evaluation metrics have traditionally been evaluated by the correlation of the scores they assign to MT output with human judgments of translation performance. Different types of human judgments, such as Fluency, Ad
Published in:
EMNLP
Traditionally, statistical machine translation systems have relied on parallel bi-lingual data to train a translation model. While bi-lingual parallel data are expensive to generate, monolingual data are relatively common. Yet monolingual data have b
Authors:
Matthew Lease, Bonnie J. Dorr, Mary P. Harper, L. Yung, Brian Roark, Yang Liu, John Hale, Matthew Snover, Anna Krasnyanskaya, Robin Stewart, Izhak Shafran
Published in:
ICASSP (1)
We present a reranking approach to sentence-like unit (SU) boundary detection, one of the EARS metadata extraction tasks. Techniques for generating relatively small n-best lists with high oracle accuracy are presented. For each candidate, features ar
Authors:
Yang Liu, Izhak Shafran, Anna Krasnyanskaya, Brian Roark, John Hale, L. Yung, Matthew Snover, Bonnie J. Dorr, Mary P. Harper, Matthew Lease, Robin Stewart
Published in:
ACL
A grammatical method of combining two kinds of speech repair cues is presented. One cue, prosodic disjuncture, is detected by a decision tree-based ensemble classifier that uses acoustic cues to identify where normal prosody seems to be interrupted (
Published in:
HLT-NAACL (Short Papers)
This paper describes a transformation-based learning approach to disfluency detection in speech transcripts using primarily lexical features. Our method produces comparable results to two other systems that make heavy use of prosodic features, thus d