Výsledky vyhledávání

Distributed Graph Processing System and Processing-in-memory Architecture with Precise Loop-carried Dependency Guarantee

Autor: Qinyi Luo, Yanzhi Wang, Depei Qian, Youwei Zhuo, Xuehai Qian, Jingji Chen, Hailong Yang, Gengyu Rao

Publikováno v: ACM Transactions on Computer Systems. 37:1-37

To hide the complexity of the underlying system, graph processing frameworks ask programmers to specify graph computations in user-defined functions (UDFs) of graph-oriented programming model. Due to the nature of distributed execution, current frame

Externí odkaz: https://explore.openaire.eu/search/publication?articleId=doi_________::cb6ec234eb0d5fe56417e470026dd6a6
https://doi.org/10.1145/3453681

Zobrazit plný text záznamu

SympleGraph: distributed graph processing with precise loop-carried dependency guarantee

Autor: Youwei Zhuo, Hailong Yang, Xuehai Qian, Qinyi Luo, Depei Qian, Jingji Chen, Yanzhi Wang

Publikováno v: PLDI

Graph analytics is an important way to understand relationships in real-world applications. At the age of big data, graphs have grown to billions of edges. This motivates distributed graph processing. Graph processing frameworks ask programmers to sp

Externí odkaz: https://explore.openaire.eu/search/publication?articleId=doi_________::7d77720d970393b79478b0ebe3918e9a
https://doi.org/10.1145/3385412.3385961

Zobrazit plný text záznamu

Prague: High-Performance Heterogeneity-Aware Asynchronous Decentralized Training

Autor: Qinyi Luo, Jiaao He, Youwei Zhuo, Xuehai Qian

Publikováno v: ASPLOS

Distributed deep learning training usually adopts All-Reduce as the synchronization mechanism for data parallel algorithms due to its high performance in homogeneous environment. However, its performance is bounded by the slowest worker among all wor

Externí odkaz: https://explore.openaire.eu/search/publication?articleId=doi_________::2487f7248152f197eda6b53919f063de
https://doi.org/10.1145/3373376.3378499

Zobrazit plný text záznamu

AccPar: Tensor Partitioning for Heterogeneous Deep Learning Accelerators

Autor: Youwei Zhuo, Yi Chen, Linghao Song, Hai Li, Xuehai Qian, Fan Chen

Publikováno v: HPCA

Deep neural network (DNN) accelerators as an example of domain-specific architecture have demonstrated great success in DNN inference. However, the architecture acceleration for equally important DNN training has not yet been fully studied. With data

Externí odkaz: https://explore.openaire.eu/search/publication?articleId=doi_________::4a0b7b149cb3baf52497fac09c43a759
https://doi.org/10.1109/hpca47549.2020.00036

Zobrazit plný text záznamu

GraphQ

Autor: Chao Wang, Yanzhi Wang, Mingxing Zhang, Niu Dimin, Rui Wang, Youwei Zhuo, Xuehai Qian

Publikováno v: MICRO

Processing-In-Memory (PIM) architectures based on recent technology advances (e.g., Hybrid Memory Cube) demonstrate great potential for graph processing. However, existing solutions did not address the key challenge of graph processing---irregular da

Externí odkaz: https://explore.openaire.eu/search/publication?articleId=doi_________::d93468d6500017784db02f428e2a245d
https://doi.org/10.1145/3352460.3358256

Zobrazit plný text záznamu

E-RNN: Design Optimization for Efficient Recurrent Neural Networks in FPGAs

Autor: Qinru Qiu, Yanzhi Wang, Zhe Li, Caiwen Ding, Wenyao Xu, Wujie Wen, Youwei Zhuo, Xuehai Qian, Siyue Wang, Chang Liu, Xue Lin

Publikováno v: HPCA

Recurrent Neural Networks (RNNs) are becoming increasingly important for time series-related applications which require efficient and real-time implementations. The two major types are Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) netw

Externí odkaz: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::919904bc1f788bce174b33af791fa094
https://doi.org/10.1109/hpca.2019.00028

Zobrazit plný text záznamu

HyPar: Towards Hybrid Parallelism for Deep Learning Accelerator Array

Autor: Jiachen Mao, Yiran Chen, Hai Li, Xuehai Qian, Youwei Zhuo, Linghao Song

Publikováno v: HPCA

With the rise of artificial intelligence in recent years, Deep Neural Networks (DNNs) have been widely used in many domains. To achieve high performance and energy efficiency, hardware acceleration (especially inference) of DNNs is intensively studie

Externí odkaz: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::8287e1c28bffd7862caa97ae301da201
http://arxiv.org/abs/1901.02067

Zobrazit plný text záznamu

CSE: Parallel Finite State Machines with Convergence Set Enumeration

Autor: Yanzhi Wang, Youwei Zhuo, Zhongzhi Luan, Jidong Zhai, Xuehai Qian, Jinglei Cheng, Qinyi Luo

Publikováno v: MICRO

Finite State Machine (FSM) is known to be embarrassingly sequential because the next state depends on the current state and input symbol. Enumerative FSM breaks the data dependencies by cutting the input symbols into segments and processing all segme

Externí odkaz: https://explore.openaire.eu/search/publication?articleId=doi_________::2253fd94d032acda039f3b846e4d2b5a
https://doi.org/10.1109/micro.2018.00012

Zobrazit plný text záznamu

Wonderland

Autor: Kang Chen, Xuehai Qian, Mingxing Zhang, Chengying Huan, Yongwei Wu, Youwei Zhuo

Publikováno v: ASPLOS

Many important graph applications are iterative algorithms that repeatedly process the input graph until convergence. For such algorithms, graph abstraction is an important technique: although much smaller than the original graph, it can bootstrap an

Externí odkaz: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::97f10e0a6f34bc377c07ea44e07ef0ac
https://doi.org/10.1145/3173162.3173208

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání