Showing 1 - 10 of 23 for search: '"Licht, Johannes de Fine"'
Author:
Pelton, Blake, Sapek, Adam, Eguro, Ken, Lo, Daniel, Forin, Alessandro, Humphrey, Matt, Xi, Jinwen, Cox, David, Karandikar, Rajas, Licht, Johannes de Fine, Babin, Evgeny, Caulfield, Adrian, Burger, Doug
Digital systems are growing in importance and computing hardware is growing more heterogeneous. Hardware design, however, remains laborious and expensive, in part due to the limitations of conventional hardware description languages (HDLs) like VHDL
External link:
http://arxiv.org/abs/2405.19514
Author:
Jiang, Wenqi, Li, Shigang, Zhu, Yu, Licht, Johannes de Fine, He, Zhenhao, Shi, Runbin, Renggli, Cedric, Zhang, Shuai, Rekatsinas, Theodoros, Hoefler, Torsten, Alonso, Gustavo
Vector search has emerged as the foundation for large-scale information retrieval and machine learning systems, with search engines like Google and Bing processing tens of thousands of queries per second on petabyte-scale document datasets by evaluat
External link:
http://arxiv.org/abs/2306.11182
Dataflow devices represent an avenue towards saving the control and data movement overhead of Load-Store Architectures. Various dataflow accelerators have been proposed, but how to efficiently schedule applications on such devices remains an open pro
External link:
http://arxiv.org/abs/2306.02730
Author:
Licht, Johannes de Fine, De Matteis, Tiziano, Ben-Nun, Tal, Kuster, Andreas, Rausch, Oliver, Burger, Manuel, Johnsen, Carl-Johannes, Hoefler, Torsten
Although high-level synthesis (HLS) tools have significantly improved programmer productivity over hardware description languages, developing for FPGAs remains tedious and error prone. Programmers must learn and implement a large set of vendor-specif
External link:
http://arxiv.org/abs/2212.13768
Author:
Johnsen, Carl-Johannes, De Matteis, Tiziano, Ben-Nun, Tal, Licht, Johannes de Fine, Hoefler, Torsten
The multi-pumping resource sharing technique can overcome the limitations commonly found in single-clocked FPGA designs by allowing hardware components to operate at a higher clock frequency than the surrounding system. However, this optimization can
External link:
http://arxiv.org/abs/2210.04598
Author:
Licht, Johannes de Fine, Pattison, Christopher A., Ziogas, Alexandros Nikolaos, Simmons-Duffin, David, Hoefler, Torsten
Numerical codes that require arbitrary precision floating point (APFP) numbers for their core computation are dominated by elementary arithmetic operations due to the super-linear complexity of multiplication in the number of mantissa bits. APFP comp
External link:
http://arxiv.org/abs/2204.06256
Author:
Calotoiu, Alexandru, Ben-Nun, Tal, Kwasniewski, Grzegorz, Licht, Johannes de Fine, Schneider, Timo, Schaad, Philipp, Hoefler, Torsten
C is the lingua franca of programming and almost any device can be programmed using C. However, programming modern heterogeneous architectures such as multi-core CPUs and GPUs requires explicitly expressing parallelism as well as device-specific pro
External link:
http://arxiv.org/abs/2112.11879
Author:
Ziogas, Alexandros Nikolaos, Schneider, Timo, Ben-Nun, Tal, Calotoiu, Alexandru, De Matteis, Tiziano, Licht, Johannes de Fine, Lavarini, Luca, Hoefler, Torsten
Python has become the de facto language for scientific computing. Programming in Python is highly productive, mainly due to its rich science-oriented software ecosystem built around the NumPy module. As a result, the demand for Python support in High
External link:
http://arxiv.org/abs/2107.00555
Author:
Licht, Johannes de Fine, Kuster, Andreas, De Matteis, Tiziano, Ben-Nun, Tal, Hofer, Dominic, Hoefler, Torsten
Spatial computing devices have been shown to significantly accelerate stencil computations, but have so far relied on unrolling the iterative dimension of a single stencil operation to increase temporal locality. This work considers the general case
External link:
http://arxiv.org/abs/2010.15218
Author:
Besta, Maciej, Fischer, Marc, Ben-Nun, Tal, Stanojevic, Dimitri, Licht, Johannes De Fine, Hoefler, Torsten
Published in:
ACM Transactions on Reconfigurable Technology and Systems (TRETS), 2020. Proceedings of the 27th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), 2019
Developing high-performance and energy-efficient algorithms for maximum matchings is becoming increasingly important in social network analysis, computational sciences, scheduling, and others. In this work, we propose the first maximum matching algor
External link:
http://arxiv.org/abs/2010.14684