Tokenization and Memory Optimization for Reducing GPU Load in NLP Deep Learning Models
Author: | Dejan Dodić, Dušan Regodić |
---|---|
Language: | English |
Year of publication: | 2024 |
Subject: | |
Source: | Tehnički Vjesnik, Vol 31, Iss 6, Pp 1995-2002 (2024) |
Document type: | article |
ISSN: | 1330-3651; 1848-6339 |
DOI: | 10.17559/TV-20231218001216 |
Description: | In the current landscape of advanced natural language processing (NLP), managing GPU memory effectively is crucial. This paper examines new tokenization methods and data-handling strategies that improve NLP model efficiency, with a focus on avoiding "CUDA out of memory" errors. It shows how sophisticated tokenization and careful management of text lengths in large datasets can boost model performance. These insights are vital for optimizing resources and scaling NLP models, especially under tight GPU memory constraints. The paper also contextualizes these challenges within NLP more broadly, underlining the significance of memory optimization as language models grow in complexity. It reviews key NLP technologies, including transformer models, and addresses their memory optimization challenges. Moreover, it describes the paper's contribution of techniques for more effective memory optimization, linking them to ongoing research and trends in NLP. This work aims to advance natural language processing methods and make AI technologies more accessible. |
Database: | Directory of Open Access Journals |
External link: |
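The abstract's central idea, controlling sequence lengths during tokenization so that padded batches never exceed the GPU's memory, can be illustrated with a small sketch. The whitespace tokenizer, the `token_budget` parameter, and the sorting-and-packing scheme below are illustrative assumptions, not the authors' actual method; a real pipeline would use a subword tokenizer and profile actual device memory.

```python
def tokenize(text, max_length=64):
    """Whitespace-tokenize and truncate to at most max_length tokens.
    (Stand-in for a real subword tokenizer; truncation is one of the
    length-management steps the abstract refers to.)"""
    return text.split()[:max_length]


def length_aware_batches(texts, token_budget=256, max_length=64):
    """Sort texts by token count, then pack them into batches so that
    (batch size) x (longest sequence in batch) never exceeds token_budget.
    Padded batches allocate batch_size * max_len slots on the device, so
    capping that product bounds peak activation memory and helps avoid
    "CUDA out of memory" errors."""
    tokenized = sorted((tokenize(t, max_length) for t in texts), key=len)
    batches, current = [], []
    for toks in tokenized:
        longest = max(len(toks), max(map(len, current), default=0))
        if current and longest * (len(current) + 1) > token_budget:
            batches.append(current)  # flush: adding would blow the budget
            current = []
        current.append(toks)
    if current:
        batches.append(current)
    return batches
```

Sorting by length first keeps sequences of similar size together, which minimizes padding waste; the budget check then guarantees every batch fits a fixed token footprint regardless of how long individual texts are.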