Search results - "Fang, Jianbin"

Report

Programming Bare-Metal Accelerators with Heterogeneous Threading Models: A Case Study of Matrix-3000

Authors: Fang, Jianbin; Zhang, Peng; Huang, Chun; Tang, Tao; Lu, Kai; Wang, Ruibo; Wang, Zheng

Published in: Frontiers of Information Technology & Electronic Engineering, 2022

As the hardware industry moves towards using specialized heterogeneous many-cores to avoid the effects of the power wall, software developers are finding it hard to deal with the complexity of these systems. This article shares our experience when de

External link: http://arxiv.org/abs/2210.12230

Report

Parallel Programming Models for Heterogeneous Many-Cores : A Survey

Authors: Fang, Jianbin; Huang, Chun; Tang, Tao; Wang, Zheng

Heterogeneous many-cores are now an integral part of modern computing systems ranging from embedding systems to supercomputers. While heterogeneous many-core design offers the potential for energy-efficient high-performance, such potential can only b

External link: http://arxiv.org/abs/2005.04094

Report

Optimizing Streaming Parallelism on Heterogeneous Many-Core Architectures: A Machine Learning Based Approach

Authors: Zhang, Peng; Fang, Jianbin; Yang, Canqun; Huang, Chun; Tang, Tao; Wang, Zheng

This article presents an automatic approach to quickly derive a good solution for hardware resource partition and task granularity for task-based parallel applications on heterogeneous many-core architectures. Our approach employs a performance model

External link: http://arxiv.org/abs/2003.04294

Report

Characterizing Scalability of Sparse Matrix-Vector Multiplications on Phytium FT-2000+ Many-cores

Authors: Chen, Donglin; Fang, Jianbin; Xu, Chuanfu; Chen, Shizhao; Wang, Zheng

Understanding the scalability of parallel programs is crucial for software optimization and hardware architecture design. As HPC hardware is moving towards many-core design, it becomes increasingly difficult for a parallel program to make effective u

External link: http://arxiv.org/abs/1911.08779

Report

To Compress, or Not to Compress: Characterizing Deep Learning Model Compression for Embedded Inference

Authors: Qin, Qing; Ren, Jie; Yu, Jialong; Gao, Ling; Wang, Hai; Zheng, Jie; Feng, Yansong; Fang, Jianbin; Wang, Zheng

The recent advances in deep neural networks (DNNs) make them attractive for embedded systems. However, it can take a long time for DNNs to make an inference on resource-constrained computing devices. Model compression techniques can address the compu

External link: http://arxiv.org/abs/1810.08899

Report

Optimizing Sparse Matrix-Vector Multiplication on Emerging Many-Core Architectures

Authors: Chen, Shizhao; Fang, Jianbin; Chen, Donglin; Xu, Chuanfu; Wang, Zheng

Sparse matrix vector multiplication (SpMV) is one of the most common operations in scientific and high-performance applications, and is often responsible for the application performance bottleneck. While the sparse matrix representation has a signifi

External link: http://arxiv.org/abs/1805.11938

Report

Tuning Streamed Applications on Intel Xeon Phi: A Machine Learning Based Approach

Authors: Zhang, Peng; Fang, Jianbin; Tang, Tao; Yang, Canqun; Wang, Zheng

Many-core accelerators, as represented by the XeonPhi coprocessors and GPGPUs, allow software to exploit spatial and temporal sharing of computing resources to improve the overall system performance. To unlock this performance potential requires soft

External link: http://arxiv.org/abs/1802.02760

Report

Streaming Applications on Heterogeneous Platforms

Authors: Li, Zhaokui; Fang, Jianbin; Tang, Tao; Chen, Xuhao; Yang, Canqun

Using multiple streams can improve the overall system performance by mitigating the data transfer overhead on heterogeneous systems. Currently, very few cases have been streamed to demonstrate the streaming performance impact and a systematic investi

External link: http://arxiv.org/abs/1608.03044
