Efficient Transfer Learning for Neural Network Language Models
Author: | Richard V. Field, Jeremy D. Wendt, Samuel N. Richter, Jacek Skryzalin, Hamilton E. Link |
---|---|
Year: | 2018 |
Subject: | Small data, Computer science, Deep learning, Convolutional neural network, Artificial intelligence, Language model, Transfer of learning, Natural language processing, Natural language, General-purpose language |
Source: | ASONAM |
Description: | We apply transfer learning techniques to create topically and/or stylistically biased natural language models from small data samples, given generic long short-term memory (LSTM) language models trained on larger data sets. Although LSTM language models are powerful tools with wide-ranging applications, they require enormous amounts of data and time to train. We therefore proactively build general-purpose language models that exploit large standing corpora and computational resources, allowing us to build more specialized analytical tools from smaller data sets on demand. We show that it is possible to construct a language model from a small, focused corpus by first training an LSTM language model on a large corpus (e.g., the text of English Wikipedia) and then retraining only the internal transition model parameters on the smaller corpus; this retraining step is sketched in the code below. We also show that a single general language model can be reused through transfer learning to create many distinct special-purpose language models quickly from modest amounts of data. |
Database: | OpenAIRE |
External link: |
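
The retraining strategy the abstract describes, holding a pretrained general-purpose LSTM language model fixed except for its internal transition parameters and then fine-tuning on a small topical corpus, might look roughly like the following PyTorch sketch. The model architecture, checkpoint file name, vocabulary size, and hyperparameters are illustrative assumptions, not the authors' actual implementation.

```python
# Minimal sketch (PyTorch assumed): fine-tune only the LSTM transition
# parameters of a pretrained language model on a small, focused corpus.
# Names and hyperparameters are hypothetical, not from the paper.
import torch
import torch.nn as nn


class LSTMLanguageModel(nn.Module):
    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens, state=None):
        out, state = self.lstm(self.embed(tokens), state)
        return self.head(out), state


model = LSTMLanguageModel(vocab_size=50_000)
# Hypothetical checkpoint from training on a large corpus (e.g., Wikipedia).
model.load_state_dict(torch.load("wikipedia_lstm_lm.pt"))

# Freeze the embedding and output layers; leave only the LSTM's
# internal (transition) weights trainable.
for name, param in model.named_parameters():
    param.requires_grad = name.startswith("lstm.")

optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)
criterion = nn.CrossEntropyLoss()


def fine_tune(batches, epochs=3):
    """Retrain on the small corpus; batches yields (inputs, targets)
    LongTensors of shape (batch, seq_len), targets shifted by one token."""
    model.train()
    for _ in range(epochs):
        for inputs, targets in batches:
            optimizer.zero_grad()
            logits, _ = model(inputs)
            loss = criterion(
                logits.reshape(-1, logits.size(-1)), targets.reshape(-1)
            )
            loss.backward()
            optimizer.step()
```

Because only the recurrent weights receive gradients, the embedding and output layers retain what was learned from the large corpus, which is what lets one general model be specialized repeatedly and cheaply into many distinct special-purpose models.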