Showing 1 - 10 of 1,736 results for search: '"Lee, Jinho"'
Author:
Lee, Hyeyoon, Choi, Kanghyun, Kwon, Dain, Park, Sunjong, Jaiswal, Mayoore Selvarasa, Park, Noseong, Choi, Jonghyun, Lee, Jinho
Recent advances in adversarial robustness rely on an abundant set of training data, where using external or additional datasets has become a common setting. However, in real life, the training data is often kept private for security and privacy issues…
External link:
http://arxiv.org/abs/2406.15635
Author:
Yim, Jinkyu, Song, Jaeyong, Choi, Yerim, Lee, Jaebeen, Jung, Jaewon, Jang, Hongsun, Lee, Jinho
Training large language models (LLMs) is known to be challenging because of the huge computational and memory capacity requirements. To address these issues, it is common to use a cluster of GPUs with 3D parallelism, which splits a model along the data…
External link:
http://arxiv.org/abs/2405.18093
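The snippet above stops at 3D parallelism, which partitions a GPU cluster along data-, pipeline-, and tensor-parallel axes. Below is a minimal Python sketch of how a flat GPU rank decomposes into those three coordinates; the axis sizes and the helper name rank_to_3d are illustrative assumptions, not details from the paper.

    # Decompose a flat rank into (data, pipeline, tensor) coordinates.
    # Axis ordering (tensor fastest-varying) is a common convention, assumed here.
    def rank_to_3d(rank: int, dp: int, pp: int, tp: int) -> tuple[int, int, int]:
        assert 0 <= rank < dp * pp * tp
        t = rank % tp                 # tensor-parallel group
        p = (rank // tp) % pp         # pipeline stage
        d = rank // (tp * pp)         # data-parallel replica
        return d, p, t

    # Example: 16 GPUs as 2-way data x 4-stage pipeline x 2-way tensor.
    for r in range(16):
        print(r, rank_to_3d(r, dp=2, pp=4, tp=2))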
Author:
Noh, Si Ung, Hong, Junguk, Lim, Chaemin, Park, Seongyeon, Kim, Jeehyun, Kim, Hanjun, Kim, Youngsok, Lee, Jinho
Recent dual in-line memory modules (DIMMs) are starting to support processing-in-memory (PIM) by associating their memory banks with processing elements (PEs), allowing applications to overcome the data movement bottleneck by offloading memory-intensive…
External link:
http://arxiv.org/abs/2404.08871
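As a rough illustration of the PIM model this abstract describes, the sketch below simulates per-bank processing elements that each reduce their local shard, so only tiny partial results cross the memory bus. The bank count and the offloaded reduction are assumptions for illustration; real PIM DIMMs run fixed hardware kernels.

    import numpy as np

    NUM_BANKS = 16
    data = np.random.rand(1 << 20).astype(np.float32)
    banks = np.array_split(data, NUM_BANKS)      # data as it sits in the banks

    def pe_kernel(shard):
        # Memory-intensive reduction executed "near" its bank.
        return shard.sum(dtype=np.float64)

    partials = [pe_kernel(b) for b in banks]     # one PE per bank, conceptually in parallel
    host_result = sum(partials)                  # host gathers only 16 scalars
    assert np.isclose(host_result, data.sum(dtype=np.float64))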
Adversarial robustness of neural networks is a significant concern when they are applied to security-critical domains. In this situation, adversarial distillation is a promising option that aims to distill the robustness of the teacher network to improve…
External link:
http://arxiv.org/abs/2403.06668
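The core step the abstract hints at, pulling a student's predictions on adversarial inputs toward a robust teacher's, can be sketched as follows. The PGD inner loop, temperature T, and KL objective form a generic adversarial-distillation recipe assumed for illustration, not necessarily this paper's exact method.

    import torch
    import torch.nn.functional as F

    def adv_distill_loss(student, teacher, x, y,
                         eps=8/255, alpha=2/255, steps=5, T=4.0):
        x = x.detach()
        x_adv = x.clone()
        for _ in range(steps):                       # PGD attack on the student
            x_adv.requires_grad_(True)
            loss = F.cross_entropy(student(x_adv), y)
            grad = torch.autograd.grad(loss, x_adv)[0]
            x_adv = (x_adv.detach() + alpha * grad.sign()).clamp(x - eps, x + eps)
            x_adv = x_adv.clamp(0.0, 1.0)
        with torch.no_grad():
            t_logits = teacher(x_adv)                # robust teacher's view of the attack
        s_logits = student(x_adv)
        return F.kl_div(F.log_softmax(s_logits / T, dim=1),
                        F.softmax(t_logits / T, dim=1),
                        reduction="batchmean") * T * T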
The recent huge advances in Large Language Models (LLMs) are mainly driven by the increase in the number of parameters. This has led to substantial memory capacity requirements, necessitating the use of dozens of GPUs just to meet the capacity. One popular…
External link:
http://arxiv.org/abs/2403.06664
With advances in genome sequencing technology, the lengths of deoxyribonucleic acid (DNA) sequencing results are rapidly increasing at lower prices than ever. However, the longer lengths come at the cost of a heavy computational burden on aligning…
External link:
http://arxiv.org/abs/2403.06478
Graph neural networks (GNNs) are one of the most rapidly growing fields within deep learning. As the datasets and model sizes used for GNNs grow, an important problem is that it becomes nearly impossible to keep the whole network…
External link:
http://arxiv.org/abs/2311.06837
Training large deep neural network models is highly challenging due to their tremendous computational and memory requirements. Blockwise distillation provides one promising method toward faster convergence by splitting a large model into multiple smaller…
External link:
http://arxiv.org/abs/2301.12443
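Blockwise distillation, as described in the snippet, trains each small student block to mimic the corresponding teacher block in isolation, so the blocks can converge independently (even on different devices). The toy architectures and MSE objective below are illustrative assumptions.

    import torch
    import torch.nn as nn

    teacher_blocks = nn.ModuleList(
        [nn.Sequential(nn.Linear(64, 64), nn.ReLU()) for _ in range(3)]).eval()
    student_blocks = nn.ModuleList(
        [nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 64))
         for _ in range(3)])

    x = torch.randn(8, 64)
    for t_blk, s_blk in zip(teacher_blocks, student_blocks):
        with torch.no_grad():
            target = t_blk(x)                  # teacher block's output
        loss = nn.functional.mse_loss(s_blk(x), target)
        loss.backward()                        # each block trains in isolation
        x = target                             # next block consumes teacher features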
Author:
Song, Jaeyong, Yim, Jinkyu, Jung, Jaewon, Jang, Hongsun, Kim, Hyung-Jin, Kim, Youngsok, Lee, Jinho
In training of modern large natural language processing (NLP) models, it has become a common practice to split models across multiple GPUs using 3D parallelism. Such a technique, however, suffers from a high overhead of inter-node communication. Compressing…
External link:
http://arxiv.org/abs/2301.09830
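This abstract is about compressing what crosses the slow inter-node links. One common scheme, assumed here purely for illustration (the paper's actual codec may differ), is top-k sparsification: send only the k largest-magnitude gradient entries plus their indices.

    import torch

    def topk_compress(grad, ratio=0.01):
        flat = grad.flatten()
        k = max(1, int(flat.numel() * ratio))
        _, idx = flat.abs().topk(k)            # keep the k largest magnitudes
        return flat[idx], idx, grad.shape      # ~1/ratio fewer values on the wire

    def topk_decompress(values, idx, shape):
        flat = torch.zeros(shape.numel(), dtype=values.dtype)
        flat[idx] = values
        return flat.view(shape)

    g = torch.randn(1024, 1024)
    v, i, s = topk_compress(g)
    g_hat = topk_decompress(v, i, s)           # sparse reconstruction of g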
Graph convolutional networks (GCNs) are becoming increasingly popular as they overcome the limited applicability of prior neural networks. A GCN takes as input an arbitrarily structured graph and executes a series of layers which exploit the graph's…
External link:
http://arxiv.org/abs/2301.10388
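A single GCN layer of the kind this abstract targets computes H' = ReLU(A_hat · H · W), where A_hat is the degree-normalized adjacency with self-loops. The dense 3-node toy graph and feature sizes below are illustrative assumptions.

    import torch

    def gcn_layer(adj, h, w):
        a = adj + torch.eye(adj.size(0))       # add self-loops
        d_inv_sqrt = a.sum(dim=1).pow(-0.5)
        a_hat = d_inv_sqrt[:, None] * a * d_inv_sqrt[None, :]  # D^-1/2 A D^-1/2
        return torch.relu(a_hat @ h @ w)       # aggregate neighbors, then transform

    adj = torch.tensor([[0., 1., 0.],
                        [1., 0., 1.],
                        [0., 1., 0.]])
    h = torch.randn(3, 4)                      # 3 nodes, 4 input features
    w = torch.randn(4, 2)                      # project to 2 output features
    print(gcn_layer(adj, h, w).shape)          # torch.Size([3, 2])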