Showing 1 - 10 of 335 for search: '"Choi Jungwook"'
Author:
Lee, Janghwan; Park, Jiwoong; Kim, Jinseok; Kim, Yongjik; Oh, Jungju; Oh, Jinwook; Choi, Jungwook
Scaling Large Language Models (LLMs) with extended context lengths has increased the need for efficient low-bit quantization to manage their substantial computational demands. However, reducing precision to 4 bits frequently degrades performance due…
External link:
http://arxiv.org/abs/2411.09909
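The entry above concerns efficient low-bit (4-bit) quantization of LLMs. As a generic illustration only (not the method of arXiv:2411.09909; the symmetric round-to-nearest scheme and per-column scaling are assumptions of this sketch), basic 4-bit weight quantization looks roughly like this:

import numpy as np

def quantize_4bit_symmetric(w, axis=0):
    """Round-to-nearest symmetric 4-bit quantization with one scale per column.

    Illustrative only: practical LLM quantizers add grouping, clipping-range
    search, and outlier handling on top of this basic scheme.
    """
    qmax = 7  # signed 4-bit codes; [-7, 7] keeps the grid symmetric
    scale = np.max(np.abs(w), axis=axis, keepdims=True) / qmax
    scale = np.where(scale == 0, 1.0, scale)       # avoid division by zero
    q = np.clip(np.round(w / scale), -qmax, qmax)  # integer codes
    return q.astype(np.int8), scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(8, 16).astype(np.float32)      # toy weight matrix
q, s = quantize_4bit_symmetric(w)
print("mean abs quantization error:", np.abs(w - dequantize(q, s)).mean())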
Handling long input contexts remains a significant challenge for Large Language Models (LLMs), particularly in resource-constrained environments such as mobile devices. Our work aims to address this limitation by introducing InfiniPot, a novel KV cac…
External link:
http://arxiv.org/abs/2410.01518
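According to the snippet, InfiniPot keeps the KV cache of an LLM within a fixed memory budget. The toy class below shows only the generic budget-capped cache pattern with FIFO eviction; the importance-based retention of the actual method is not reproduced here, and the class name is a placeholder.

from collections import deque

class BudgetedKVCache:
    """Toy KV cache that never holds more than `budget` entries.

    Eviction is plain FIFO for illustration; a real method would score
    entries (e.g., by attention importance) and keep the useful ones.
    """
    def __init__(self, budget):
        self.budget = budget
        self.keys = deque()
        self.values = deque()

    def append(self, k, v):
        self.keys.append(k)
        self.values.append(v)
        while len(self.keys) > self.budget:  # enforce the memory budget
            self.keys.popleft()
            self.values.popleft()

    def __len__(self):
        return len(self.keys)

cache = BudgetedKVCache(budget=4)
for t in range(10):
    cache.append(f"k{t}", f"v{t}")
print(len(cache), list(cache.keys))  # 4 ['k6', 'k7', 'k8', 'k9']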
Author:
Li Xinlin, Wang Rixuan, Wang Leilei, Li Aizhen, Tang Xiaowu, Choi Jungwook, Zhang Pengfei, Jin Ming Liang, Joo Sang Woo
Published in:
Nanotechnology Reviews, Vol. 9, Iss. 1, pp. 1183-1191 (2020)
The development of stretchable wearable devices requires essential materials with a high level of mechanical and electrical properties as well as scalability. Recently, silicone rubber-based elastic polymers with incorporated conductive fillers (metal part…
External link:
https://doaj.org/article/a8e27fadc4ea4f72ac2048bdef267cbe
Pillar-based 3D object detection has gained traction in self-driving technology due to its speed and accuracy, facilitated by the artificial densification of pillars for GPU-friendly processing. However, dense pillar processing fundamentally wastes co…
External link:
http://arxiv.org/abs/2408.13798
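The snippet contrasts GPU-friendly dense pillar processing with the sparsity of real point clouds. A minimal pillarization sketch (the 0.5 m grid size and the (N, 3) numpy input are assumptions) makes the point that only a fraction of grid cells is ever occupied:

import numpy as np

def build_pillars(points, grid_size=0.5):
    """Group 3D points into occupied (x, y) pillars.

    Only non-empty pillars are materialized, which is why processing a
    dense grid wastes work on cells that contain no points at all.
    """
    cells = np.floor(points[:, :2] / grid_size).astype(np.int64)
    pillars = {}
    for idx, cell in enumerate(map(tuple, cells)):
        pillars.setdefault(cell, []).append(idx)
    return pillars

points = np.random.uniform(-10, 10, size=(1000, 3)).astype(np.float32)
pillars = build_pillars(points)
print(f"occupied pillars: {len(pillars)} of {40 * 40} cells in the dense grid")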
The rapid advancement of large language models (LLMs) has facilitated their transformation into conversational chatbots that can grasp contextual nuances and generate pertinent sentences, closely mirroring human values through advanced techniques suc…
External link:
http://arxiv.org/abs/2407.03051
Enhancing Computation Efficiency in Large Language Models through Weight and Activation Quantization
Author:
Lee, Janghwan; Kim, Minsoo; Baek, Seungcheol; Hwang, Seok Joong; Sung, Wonyong; Choi, Jungwook
Large Language Models (LLMs) are proficient in natural language processing tasks, but their deployment is often restricted by extensive parameter sizes and computational demands. This paper focuses on post-training quantization (PTQ) in LLMs, specifi…
External link:
http://arxiv.org/abs/2311.05161
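For the weight-and-activation theme of this entry, the standard simulated (affine) quantization round trip can be sketched as below. This is textbook PTQ with per-tensor parameters calibrated from sample data, not the specific algorithm of arXiv:2311.05161.

import numpy as np

def affine_qparams(x, n_bits=8):
    """Per-tensor asymmetric (affine) quantization parameters."""
    qmin, qmax = 0, 2 ** n_bits - 1
    lo, hi = float(x.min()), float(x.max())
    scale = max(hi - lo, 1e-8) / (qmax - qmin)
    zero_point = int(round(qmin - lo / scale))
    return scale, zero_point

def quantize(x, scale, zp, n_bits=8):
    return np.clip(np.round(x / scale) + zp, 0, 2 ** n_bits - 1).astype(np.uint8)

def dequantize(q, scale, zp):
    return (q.astype(np.float32) - zp) * scale

# Calibrate ranges on sample data, then compare the float matmul with the
# matmul over quantize-dequantize reconstructions of weights and activations.
acts = np.random.randn(32, 64).astype(np.float32)
weights = np.random.randn(64, 64).astype(np.float32)
a_scale, a_zp = affine_qparams(acts)
w_scale, w_zp = affine_qparams(weights)
a_hat = dequantize(quantize(acts, a_scale, a_zp), a_scale, a_zp)
w_hat = dequantize(quantize(weights, w_scale, w_zp), w_scale, w_zp)
print("mean output error:", np.abs(acts @ weights - a_hat @ w_hat).mean())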
Author:
Kim, Minsoo; Lee, Sihwa; Lee, Janghwan; Hong, Sukjin; Chang, Du-Seong; Sung, Wonyong; Choi, Jungwook
Generative Language Models (GLMs) have shown impressive performance in tasks such as text generation, understanding, and reasoning. However, the large model size poses challenges for practical deployment. To solve this problem, Quantization-Aware Tra…
External link:
http://arxiv.org/abs/2308.06744
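Quantization-Aware Training simulates quantization in the forward pass and passes gradients through the rounding with a straight-through estimator, so the full-precision weights keep adapting to the quantized behavior. A minimal PyTorch sketch of one such step (the 4-bit setting and toy layer are placeholders, not the paper's recipe):

import torch

def fake_quant(x, n_bits=4):
    """Symmetric fake quantization with a straight-through estimator.

    The rounded value is used in the forward pass, while gradients flow
    through as if rounding were the identity function.
    """
    qmax = 2 ** (n_bits - 1) - 1
    scale = x.detach().abs().max().clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round(x / scale), -qmax, qmax) * scale
    return x + (q - x).detach()  # straight-through estimator

layer = torch.nn.Linear(16, 16)
opt = torch.optim.SGD(layer.parameters(), lr=1e-2)
x, target = torch.randn(8, 16), torch.randn(8, 16)

# One QAT step: quantize weights on the fly, backpropagate, update fp32 weights.
out = torch.nn.functional.linear(x, fake_quant(layer.weight), layer.bias)
loss = torch.nn.functional.mse_loss(out, target)
loss.backward()
opt.step()
print(float(loss))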
Author:
Lee, Minjae; Park, Seongmin; Kim, Hyungmin; Yoon, Minyong; Lee, Janghwan; Choi, Jun Won; Kim, Nam Sung; Kang, Mingu; Choi, Jungwook
3D object detection using point cloud (PC) data is essential for perception pipelines of autonomous driving, where efficient encoding is key to meeting stringent resource and latency requirements. PointPillars, a widely adopted bird's-eye view (BEV)…
External link:
http://arxiv.org/abs/2305.07522
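PointPillars encodes features only for non-empty pillars and then scatters them into a dense bird's-eye-view canvas that a 2D CNN consumes. A rough sketch of that scatter step (tensor shapes, channel count, and grid size are assumptions of this example):

import torch

def scatter_to_bev(pillar_features, coords, grid_h, grid_w):
    """Place encoded pillar features into a dense (C, H, W) BEV canvas.

    pillar_features: (P, C) features, one row per non-empty pillar
    coords:          (P, 2) integer (row, col) grid positions of those pillars
    """
    c = pillar_features.shape[1]
    canvas = torch.zeros(c, grid_h * grid_w, dtype=pillar_features.dtype)
    flat = coords[:, 0] * grid_w + coords[:, 1]  # flatten (row, col) indices
    canvas[:, flat] = pillar_features.t()        # scatter pillar columns
    return canvas.view(c, grid_h, grid_w)

features = torch.randn(5, 64)                    # 5 occupied pillars
coords = torch.tensor([[0, 0], [3, 7], [10, 2], [15, 15], [31, 31]])
bev = scatter_to_bev(features, coords, grid_h=32, grid_w=32)
print(bev.shape)                                 # torch.Size([64, 32, 32])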
Pre-trained Transformer models such as BERT have shown great success in a wide range of applications, but at the cost of substantial increases in model complexity. Quantization-aware training (QAT) is a promising method to lower the implementation co…
External link:
http://arxiv.org/abs/2302.11812
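Applying QAT to a pre-trained Transformer such as BERT usually amounts to wrapping its existing linear projections with fake quantization rather than rebuilding the model. A generic module-swapping sketch (the wrapper, 8-bit setting, and toy MLP below are illustrative assumptions, not the paper's configuration):

import torch
import torch.nn as nn

class QuantLinear(nn.Module):
    """Wraps an nn.Linear so its weight is fake-quantized in every forward."""
    def __init__(self, linear, n_bits=8):
        super().__init__()
        self.linear = linear
        self.qmax = 2 ** (n_bits - 1) - 1

    def forward(self, x):
        w = self.linear.weight
        scale = w.detach().abs().max().clamp(min=1e-8) / self.qmax
        w_q = torch.clamp(torch.round(w / scale), -self.qmax, self.qmax) * scale
        w_q = w + (w_q - w).detach()  # straight-through estimator
        return nn.functional.linear(x, w_q, self.linear.bias)

def swap_linears(model, n_bits=8):
    """Replace every nn.Linear submodule in-place with its quantized wrapper."""
    for name, child in model.named_children():
        if isinstance(child, nn.Linear):
            setattr(model, name, QuantLinear(child, n_bits))
        else:
            swap_linears(child, n_bits)

# The same swap would target a Transformer's attention and FFN projections.
mlp = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 16))
swap_linears(mlp)
print(mlp(torch.randn(4, 16)).shape)  # torch.Size([4, 16])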
Transformer-based deep neural networks have achieved great success in various sequence applications due to their powerful ability to model long-range dependency. The key module of the Transformer is self-attention (SA), which extracts features from the en…
External link:
http://arxiv.org/abs/2301.12444
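Self-attention lets every position aggregate features from the whole sequence, which is the source of both its long-range modeling power and its quadratic cost in sequence length. A minimal single-head, numpy-only sketch of scaled dot-product self-attention:

import numpy as np

def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention.

    x: (T, d) token embeddings; wq/wk/wv: (d, d) projection matrices.
    Every output position attends to all T positions, hence the (T, T)
    score matrix that dominates the cost for long sequences.
    """
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(x.shape[1])         # (T, T) similarities
    scores -= scores.max(axis=-1, keepdims=True)   # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                             # (T, d) attended features

d, T = 8, 5
rng = np.random.default_rng(0)
x = rng.standard_normal((T, d))
out = self_attention(x, *(rng.standard_normal((d, d)) for _ in range(3)))
print(out.shape)  # (5, 8)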