Explicit Word Density Estimation for Language Modelling

Authors: Andonov, Jovan; Ganea, Octavian; Grnarova, Paulina; Bécigneul, Gary; Hofmann, Thomas
Year of publication: 2024
Subject:
Document type: Working Paper
Description: Language modelling has long been a central part of Natural Language Processing, and in the past few years LSTM-based language models have been the go-to method for commercial language modelling. Recently, it has been shown that, when language modelling is viewed from a matrix factorization point of view, the final Softmax layer limits the expressiveness of the model by putting an upper bound on the rank of the resulting matrix. Additionally, a new family of neural networks called NeuralODEs has been introduced as a continuous alternative to Residual Networks. Moreover, it has been shown that there is a connection between these models and Normalizing Flows. In this work we propose a new family of language models based on NeuralODEs and the continuous analogue of Normalizing Flows, and manage to improve on some of the baselines.
Comment: Master's thesis
Database: arXiv