Author:
Po Hao Chen, Pouya Haghi, Jae Yoon Chung, Tong Geng, Richard West, Anthony Skjellum, Martin C. Herbordt
Published in:
2022 IEEE High Performance Extreme Computing Conference (HPEC).
Author:
Anqi Guo, Tong Geng, Yongan Zhang, Pouya Haghi, Chunshu Wu, Cheng Tan, Yingyan Lin, Ang Li, Martin Herbordt
Published in:
2022 32nd International Conference on Field-Programmable Logic and Applications (FPL).
Author:
Rushi Patel, Pouya Haghi, Shweta Jain, Andriy Kot, Venkata Krishnan, Mayank Varia, Martin Herbordt
Published in:
2022 IEEE 30th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM).
Author:
Anqi Guo, Tong Geng, Yongan Zhang, Pouya Haghi, Chunshu Wu, Cheng Tan, Yingyan Lin, Ang Li, Martin Herbordt
Published in:
2022 IEEE 30th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM).
Published in:
IEEE Transactions on Circuits and Systems I: Regular Papers. 67:3056-3069
In this paper, we propose O4-DNN, a high-performance FPGA-based architecture for convolutional neural network (CNN) accelerators relying on operation packing and out-of-order (OoO) execution for DSP blocks augmented with LUT-based glue logic.
Author:
Pouya Haghi, Anqi Guo, Qingqing Xiong, Chen Yang, Tong Geng, Justin T. Broaddus, Ryan Marshall, Derek Schafer, Anthony Skjellum, Martin C. Herbordt
Published in:
Concurrency and Computation: Practice and Experience. 34
Published in:
2021 IEEE High Performance Extreme Computing Conference (HPEC).
Author:
Tong Geng, Chunshu Wu, Cheng Tan, Chenhao Xie, Anqi Guo, Pouya Haghi, Sarah Yuan He, Jiajia Li, Martin Herbordt, Ang Li
Published in:
2021 IEEE High Performance Extreme Computing Conference (HPEC).
Author:
Justin Broaddus, Martin C. Herbordt, Tong Geng, Derek Schafer, Pouya Haghi, Anqi Guo, Anthony Skjellum
Published in:
FPT
Collectives are a fundamental part of HPC applications and their optimization has undergone decades of study. In recent years collectives have been accelerated with in-network hardware support, initially in the NIC, but recently also in the switch. T
Author:
Rushi Patel, Tong Geng, Anqi Guo, Qingqing Xiong, Ryan Marshall, Chen Yang, Justin T. Broaddus, Martin C. Herbordt, Pouya Haghi, Anthony Skjellum
Published in:
HPEC
MPI collective operations can often be performance killers in HPC applications; we seek to solve this bottleneck by offloading them to reconfigurable hardware within the switch itself, rather than, e.g., the NIC. We have designed a hardware accelerat