Showing 1 - 10 of 332 for search: '"Unsal, Osman"'
Author:
Boixaderas, Isaac, Amaya, Jorge, Moré, Sergi, Bartolome, Javier, Vicente, David, Unsal, Osman, Gizopoulos, Dimitris, Carpenter, Paul M., Radojković, Petar, Ayguadé, Eduard
It is widely accepted that cosmic rays are a plausible cause of DRAM errors in high-performance computing (HPC) systems, and various studies suggest that they could explain some aspects of the observed DRAM error behavior. However, this phenomenon is
External link:
http://arxiv.org/abs/2407.16487
Author:
Lazo, Cristóbal Ramírez, Reggiani, Enrico, Morales, Carlos Rojas, Bagué, Roger Figueras, Vargas, Luis Alfonso Villa, Salinas, Marco Antonio Ramírez, Cortés, Mateo Valero, Unsal, Osman Sabri, Cristal, Adrián
Modern scientific applications are getting more diverse, and the vector lengths in those applications vary widely. Contemporary Vector Processors (VPs) are designed either for short vector lengths, e.g., Fujitsu A64FX with 512-bit ARM SVE vector supp
External link:
http://arxiv.org/abs/2111.05301
Author:
Lazo, Cristóbal Ramírez, Hernández, César Alejandro, Palomar, Oscar, Unsal, Osman Sabri, Ramírez, Marco Antonio, Cristal, Adrián
Vector architectures lack tools for research. Consider the gem5 simulator, which is possibly the leading platform for computer-system architecture research. Unfortunately, gem5 does not have an available distribution that includes a flexible and cust
External link:
http://arxiv.org/abs/2111.01949
Author:
Göttel, Christian, Parasyris, Konstantinos, Unsal, Osman, Felber, Pascal, Pasin, Marcelo, Schiavoni, Valerio
Published in:
2021 40th International Symposium on Reliable Distributed Systems (SRDS) (2021) 187-197
Latest ARM processors are approaching the computational power of x86 architectures while consuming much less energy. Consequently, supply follows demand with Amazon EC2, Equinix Metal and Microsoft Azure offering ARM-based instances, while Oracle Clo
External link:
http://arxiv.org/abs/2107.00416
Author:
Larimi, Seyed Saber Nabavi, Salami, Behzad, Unsal, Osman S., Kestelman, Adrian Cristal, Sarbazi-Azad, Hamid, Mutlu, Onur
Modern computing devices employ High-Bandwidth Memory (HBM) to meet their memory bandwidth requirements. An HBM-enabled device consists of multiple DRAM layers stacked on top of one another next to a compute chip (e.g. CPU, GPU, and FPGA) in the same
External link:
http://arxiv.org/abs/2101.00969
Author:
Papadimitriou, George, Chatzidimitriou, Athanasios, Gizopoulos, Dimitris, Reddi, Vijay Janapa, Leng, Jingwen, Salami, Behzad, Unsal, Osman S., Kestelman, Adrian Cristal
Modern large-scale computing systems (data centers, supercomputers, cloud and edge setups and high-end cyber-physical systems) employ heterogeneous architectures that consist of multicore CPUs, general-purpose many-core GPUs, and programmable FPGAs.
External link:
http://arxiv.org/abs/2006.01049
In this paper, we exploit the aggressive supply voltage underscaling technique in Block RAMs (BRAMs) of Field Programmable Gate Arrays (FPGAs) to improve the energy efficiency of Multi-Layer Perceptrons (MLPs). Additionally, we evaluate and improve t
External link:
http://arxiv.org/abs/2005.04737
Author:
Salami, Behzad, Onural, Erhan Baturay, Yuksel, Ismail Emir, Koc, Fahrettin, Ergin, Oguz, Kestelman, Adrian Cristal, Unsal, Osman S., Sarbazi-Azad, Hamid, Mutlu, Onur
We empirically evaluate an undervolting technique, i.e., underscaling the circuit supply voltage below the nominal level, to improve the power-efficiency of Convolutional Neural Network (CNN) accelerators mapped to Field Programmable Gate Arrays (FPG
External link:
http://arxiv.org/abs/2005.03451
Author:
Givaki, Kamyar, Salami, Behzad, Hojabr, Reza, Tayaranian, S. M. Reza, Khonsari, Ahmad, Rahmati, Dara, Gorgin, Saeid, Cristal, Adrian, Unsal, Osman S.
Deep Neural Networks (DNNs) are inherently computation-intensive and also power-hungry. Hardware accelerators such as Field Programmable Gate Arrays (FPGAs) are a promising solution that can satisfy these requirements for both embedded and High-Perfo
External link:
http://arxiv.org/abs/2001.00053
This paper presents a deeply pipelined and massively parallel Binary Search Tree (BST) accelerator for Field Programmable Gate Arrays (FPGAs). Our design relies on the extremely parallel on-chip memory, or Block RAMs (BRAMs) architecture of FPGAs. To
External link:
http://arxiv.org/abs/1912.01556