Showing 1 - 10 of 370 for search: '"Li, Shigang"'
Disconnectivity and distortion are the two problems that must be dealt with when processing 360-degree equirectangular images. In this paper, we propose a method for estimating the depth of a monocular panoramic image with a teacher-student model …
External link:
http://arxiv.org/abs/2405.16858
Author:
Wu, Baodong, Xia, Lei, Li, Qingping, Li, Kangyu, Chen, Xu, Guo, Yongqiang, Xiang, Tieyao, Chen, Yuheng, Li, Shigang
Large language models (LLMs) with hundreds of billions or trillions of parameters, represented by ChatGPT, have had a profound impact on various fields. However, training LLMs with super-large-scale parameters requires large high-performance GPU …
External link:
http://arxiv.org/abs/2310.10046
Author:
Blach, Nils, Besta, Maciej, De Sensi, Daniele, Domke, Jens, Harake, Hussein, Li, Shigang, Iff, Patrick, Konieczny, Marek, Lakhotia, Kartik, Kubicek, Ales, Ferrari, Marcel, Petrini, Fabrizio, Hoefler, Torsten
Published in:
Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI '24), Santa Clara, CA, USA, April 16-18, 2024
Novel low-diameter network topologies such as Slim Fly (SF) offer significant cost and power advantages over the established Fat Tree, Clos, or Dragonfly. To spearhead the adoption of low-diameter networks, we design, implement, deploy, and evaluate …
External link:
http://arxiv.org/abs/2310.03742
Author:
Jiang, Wenqi, Li, Shigang, Zhu, Yu, Licht, Johannes de Fine, He, Zhenhao, Shi, Runbin, Renggli, Cedric, Zhang, Shuai, Rekatsinas, Theodoros, Hoefler, Torsten, Alonso, Gustavo
Vector search has emerged as the foundation for large-scale information retrieval and machine learning systems, with search engines like Google and Bing processing tens of thousands of queries per second on petabyte-scale document datasets by …
External link:
http://arxiv.org/abs/2306.11182
Gradient preconditioning is a key technique for integrating second-order information into gradients to improve and extend gradient-based learning algorithms. In deep learning, stochasticity, nonconvexity, and high dimensionality lead to a wide …
External link:
http://arxiv.org/abs/2305.04684
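The snippet above concerns gradient preconditioning. As a minimal sketch only (not the method of the listed paper), a diagonal, Adagrad-style preconditioner is one common way to fold per-coordinate curvature information into a gradient step, rescaling each coordinate by accumulated squared gradients; the function name and values here are purely illustrative.

```python
import numpy as np

def preconditioned_step(w, grad, accum, lr=0.1, eps=1e-8):
    """One gradient step with a diagonal preconditioner (Adagrad-style sketch)."""
    accum += grad ** 2                        # per-coordinate curvature proxy
    w -= lr * grad / (np.sqrt(accum) + eps)   # rescale each coordinate independently
    return w, accum

w = np.array([1.0, -2.0])
accum = np.zeros_like(w)
w, accum = preconditioned_step(w, np.array([0.5, -1.0]), accum)
# each coordinate moves by roughly lr in magnitude, regardless of gradient scale
```

Full preconditioners replace the diagonal with (approximations of) the inverse Hessian or Fisher matrix; the diagonal case is simply the cheapest instance of the idea.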
Nowadays, panoramic images can be easily obtained with panoramic cameras. However, when the panoramic camera's orientation is tilted, a non-upright panoramic image will be captured. Existing upright adjustment models focus on how to estimate more …
External link:
http://arxiv.org/abs/2304.05556
Recent advances in deep learning are driven by the growing scale of computation, data, and models. However, efficiently training large-scale models on distributed systems requires an intricate combination of data, operator, and pipeline parallelism, …
External link:
http://arxiv.org/abs/2301.06813
Pipeline parallelism enables efficient training of Large Language Models (LLMs) on large-scale distributed accelerator clusters. Yet, pipeline bubbles during startup and tear-down reduce the utilization of accelerators. Although efficient pipeline …
External link:
http://arxiv.org/abs/2211.14133
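The entry above mentions pipeline bubbles during startup and tear-down. As a back-of-envelope sketch (not taken from the listed paper), the classic GPipe-style bubble fraction for p pipeline stages and m micro-batches is (p - 1) / (m + p - 1), which shows why more micro-batches improve accelerator utilization:

```python
def bubble_fraction(p: int, m: int) -> float:
    """Idle fraction of a naive p-stage pipeline fed with m micro-batches."""
    return (p - 1) / (m + p - 1)

# With 4 stages and only 1 micro-batch, 3/4 of the schedule is idle;
# raising the micro-batch count shrinks the bubble toward zero.
few = bubble_fraction(4, 1)
many = bubble_fraction(4, 32)
```

Schedulers such as the one this paper proposes aim to reclaim exactly this startup/tear-down idle time rather than merely amortizing it with larger m.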
The exponentially growing model size drives the continued success of deep learning, but it brings prohibitive computation and memory costs. From the algorithm perspective, model sparsification and quantization have been studied to alleviate the …
External link:
http://arxiv.org/abs/2209.06979
Author:
Hoefler, Torsten, Bonato, Tommaso, De Sensi, Daniele, Di Girolamo, Salvatore, Li, Shigang, Heddes, Marco, Belk, Jon, Goel, Deepak, Castro, Miguel, Scott, Steve
Numerous microarchitectural optimizations unlocked tremendous processing power for deep neural networks, which in turn fueled the AI revolution. With the exhaustion of such optimizations, the growth of modern AI is now gated by the performance of …
External link:
http://arxiv.org/abs/2209.01346