Showing 1 - 2 of 2 for search: '"Thallinger, Bernhard"'
We demonstrate that carefully adjusting the tokenizer of the Whisper speech recognition model significantly improves the precision of word-level timestamps when applying dynamic time warping to the decoder's cross-attention scores. We fine-tune the m
External link:
http://arxiv.org/abs/2408.16589
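
The abstract above refers to applying dynamic time warping (DTW) to the decoder's cross-attention scores to obtain word-level timestamps. As a rough illustration of that general idea only, and not the authors' implementation, the sketch below aligns tokens to audio frames with a plain textbook DTW. The toy attention matrix, the `1 - attention` cost, the function names `dtw_path` and `token_start_frames`, and the 0.02 s frame duration are all assumptions made for this example.

```python
def dtw_path(cost):
    """Return the minimum-cost monotonic path through a cost matrix.

    cost[i][j] is the cost of aligning token i with audio frame j.
    The path runs from (0, 0) to (last token, last frame).
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    # acc[i][j] = cheapest accumulated cost to reach cell (i-1, j-1)
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i][j] = cost[i - 1][j - 1] + min(
                acc[i - 1][j - 1],  # advance token and frame
                acc[i - 1][j],      # advance token only
                acc[i][j - 1],      # advance frame only
            )
    # Backtrack from the final cell along the cheapest predecessors.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (acc[i - 1][j - 1], i - 1, j - 1),
            (acc[i - 1][j], i - 1, j),
            (acc[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates, key=lambda c: c[0])
    path.reverse()
    return path


def token_start_frames(path, n_tokens):
    """First frame index that the path assigns to each token."""
    starts = [None] * n_tokens
    for token, frame in path:
        if starts[token] is None:
            starts[token] = frame
    return starts


# Toy attention matrix (hypothetical values): 3 tokens x 6 frames.
attention = [
    [0.9, 0.8, 0.1, 0.0, 0.0, 0.0],
    [0.1, 0.2, 0.8, 0.9, 0.1, 0.1],
    [0.0, 0.0, 0.1, 0.1, 0.9, 0.9],
]
# Assumption: high attention means low alignment cost.
cost = [[1.0 - a for a in row] for row in attention]

path = dtw_path(cost)
starts = token_start_frames(path, len(attention))

FRAME_SECONDS = 0.02  # assumed frame duration, for illustration only
start_times = [s * FRAME_SECONDS for s in starts]
```

Here each token's start time is taken as the first frame on the warping path that is assigned to it. How well those boundaries line up with real word boundaries depends on which units the tokens represent, which is presumably where the tokenizer adjustment described in the abstract comes in.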