Showing 1 - 10
of 417 869
for search: '"optimizations."'
Author:
Kumar, Harsh, Govindarajan, R.
Even in the era of Deep Learning based methods, traditional machine learning methods with large data sets continue to attract significant attention. However, we find an apparent lack of a detailed performance characterization of these methods in the
External link:
http://arxiv.org/abs/2412.19051
Author:
Yuan, Xiulong, Yan, Xu, Shen, Wenting, Qiu, Xiafei, Wang, Ang, Zhang, Jie, Li, Yong, Lin, Wei
Published in:
"NeurIPS BladeDISC++: Memory Optimizations Based On Symbolic Shape," Neurips.cc, 2024. https://neurips.cc/virtual/2024/103601 (accessed Dec. 22, 2024)
Recent deep learning workloads exhibit dynamic characteristics, leading to the rising adoption of dynamic shape compilers. These compilers can generate efficient kernels for dynamic shape graphs characterized by a fixed graph topology and uncertain t
External link:
http://arxiv.org/abs/2412.16985
Author:
Italiano, Davide, Cummins, Chris
Compilers are complex, and significant effort has been expended on testing them. Techniques such as random program generation and differential testing have proved highly effective and have uncovered thousands of bugs in production compilers. The majo
External link:
http://arxiv.org/abs/2501.00655
Author:
Song, Mingcong, Tang, Xinru, Hou, Fengfan, Li, Jing, Wei, Wei, Ma, Yipeng, Xiao, Runqiu, Si, Hongjie, Jiang, Dingcheng, Yin, Shouyi, Hu, Yang, Long, Guoping
Meeting growing demands for low latency and cost efficiency in production-grade large language model (LLM) serving systems requires integrating advanced optimization techniques. However, dynamic and unpredictable input-output lengths of LLM, compound
External link:
http://arxiv.org/abs/2412.18106
Mixture-of-experts-based (MoE-based) diffusion models have shown their scalability and ability to generate high-quality images, making them a promising choice for efficient model scaling. However, they rely on expert parallelism across GPUs, necessit
External link:
http://arxiv.org/abs/2411.16786
Ionization chambers are essential for activity determinations in radionuclide metrology. We have developed a high-precision integrating-differentiating (int-diff) system for measuring small currents. It is anticipated to enhance the ionization curren
External link:
http://arxiv.org/abs/2412.18252
Author:
Motonaga, Shoya
We study optimization problems in ergodic theory from the view point of minimax problems. We give minimax characterizations of maximum ergodic averages involving time averages. Our approach works for the abstract variational principle of generalized
External link:
http://arxiv.org/abs/2411.17615
Developing an efficient code for large, multiscale astrophysical simulations is crucial in preparing the upcoming era of exascale computing. RAMSES is an astrophysical simulation code that employs parallel processing based on the Message Passing Inte
External link:
http://arxiv.org/abs/2411.14631
With heterogeneous systems, the number of GPUs per chip increases to provide computational capabilities for solving science at a nanoscopic scale. However, low utilization for single GPUs defies the need to invest more money for expensive accelerators
External link:
http://arxiv.org/abs/2408.10143