Showing 1 - 1 of 1 for search: '"Vishnu, Vishwajit Kumar"'
Training memory-based transformers can require a large amount of memory and can be quite inefficient. We propose a novel two-phase training mechanism and a novel regularization technique to improve the training efficiency of memory-based transformers.
External link:
http://arxiv.org/abs/2311.08123