Showing 1 - 10 of 10 for search: '"Wanwang Yin"'
Author:
Fangfang Liu, Wenjing Ma, Yuwen Zhao, Daokun Chen, Yi Hu, Qinglin Lu, WanWang Yin, Xinhui Yuan, Lijuan Jiang, Hao Yan, Min Li, Hongsen Wang, Xinyu Wang, Chao Yang
Published in:
CCF Transactions on High Performance Computing. 5:56-71
Published in:
SC22: International Conference for High Performance Computing, Networking, Storage and Analysis.
Published in:
Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming.
Published in:
SC
We study and evaluate performance optimization techniques for the HPCG benchmark on the newest generation Sunway supercomputer. Specifically, a two-level blocking scheme is proposed to expose adequate parallelism in the symmetric Gauss-Seidel kernel
Author:
Fangfang Liu, Wenjing Ma, Yuwen Zhao, Daokun Chen, Yi Hu, Qinglin Lu, WanWang Yin, Xinhui Yuan, Lijuan Jiang, Hao Yan, Min Li, Hongsen Wang, Xinyu Wang, Chao Yang
Published in:
CCF Transactions on High Performance Computing. 5:97-97
Published in:
ACM Transactions on Architecture and Code Optimization. 15:1-20
In this article, we present some key techniques for optimizing HPCG on Sunway TaihuLight and demonstrate how to achieve high performance in memory-bound applications by exploiting specific characteristics of the hardware architecture. In particular,
Published in:
ICCIP
This paper presents analysis and optimizations for High Performance Conjugate Gradient benchmark (HPCG) on the Sunway many-core processor. For modern multi-core and many-core processors, HPCG always presents a poor performance and under-utilizes comp
Author:
Guangwen Yang, Haohuan Fu, Weiguo Liu, Xiaofei Chen, Conghui He, Zekun Yin, Zhenguo Zhang, Tingjian Zhang, Wenqiang Zhang, Wanwang Yin, Wei Xue, Bingwei Chen
Published in:
SC
This paper reports our large-scale nonlinear earthquake simulation software on Sunway TaihuLight. Our innovations include: (1) a customized parallelization scheme that employs the 10 million cores efficiently at both the process and the thread levels
Author:
Rongfen Lin, Chao Yang, Qiao Sun, Lijuan Jiang, Peng Zhang, Wanwang Yin, Wenjing Ma, Yulong Ao, Fangfang Liu
Published in:
ICPP
The matrix-matrix multiplication is an essential building block that can be found in various scientific and engineering applications. High-performance implementations of the matrix-matrix multiplication on state-of-the-art processors may be of great
Author:
Jidong Zhai, Wenguang Chen, Weimin Zheng, Xiongchao Tang, Youwei Zhuo, Wanwang Yin, Bowen Yu, Heng Lin
Published in:
IPDPS
Interest has recently grown in efficiently analyzing unstructured data such as social network graphs and protein structures. A fundamental graph algorithm for doing such task is the Breadth-First Search (BFS) algorithm, the foundation for many other