Authors:
Wolleb, Benoist; Silvestri, Romain; Vernikos, Giorgos; Dolamic, Ljiljana; Popescu-Belis, Andrei
Subword tokenization is the de facto standard for tokenization in neural language models and machine translation systems. Three advantages are frequently cited in favor of subwords: shorter encoding of frequent tokens, compositionality of subwords, a
External link:
http://arxiv.org/abs/2306.01393