Hardware Based Loop Optimization for CGRA Architectures

Authors:	Kevin Martin, Chilankamol Sunny, Satyajit Das, Philippe Coussy
Contributors:	Indian Institute of Technology [Palakkad] (IIT Palakkad), Equipe Hardware ARchitectures and CAD tools (Lab-STICC_ARCAD), Laboratoire des sciences et techniques de l'information, de la communication et de la connaissance (Lab-STICC), École Nationale d'Ingénieurs de Brest (ENIB)-Université de Bretagne Sud (UBS)-Université de Brest (UBO)-École Nationale Supérieure de Techniques Avancées Bretagne (ENSTA Bretagne)-Institut Mines-Télécom [Paris] (IMT)-Centre National de la Recherche Scientifique (CNRS)-Université Bretagne Loire (UBL)-IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT)-École Nationale d'Ingénieurs de Brest (ENIB)-Université de Bretagne Sud (UBS)-Université de Brest (UBO)-École Nationale Supérieure de Techniques Avancées Bretagne (ENSTA Bretagne)-Institut Mines-Télécom [Paris] (IMT)-Centre National de la Recherche Scientifique (CNRS)-Université Bretagne Loire (UBL)-IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT), Université de Bretagne Sud - Lorient (UBS Lorient), Université de Bretagne Sud (UBS)
Language:	English
Publication year:	2021
Subject:	010302 applied physics; [INFO.INFO-AR]Computer Science [cs]/Hardware Architecture [cs.AR]; Loop (graph theory); Loop optimization; business.industry; Computer science; 02 engineering and technology; Directed acyclic graph; Supercomputer; 01 natural sciences; 020202 computer hardware & architecture; Software; 0103 physical sciences; 0202 electrical engineering, electronic engineering, information engineering; Overhead (computing); business; Nested loop join; Computer hardware; ComputingMilieux_MISCELLANEOUS; Block (data storage)
Source:	Applied Reconfigurable Computing. Architectures, Tools, and Applications (ARC), Jun 2021, Rennes, France, pp. 65-80, ⟨10.1007/978-3-030-79025-7_5⟩; ISBN: 9783030790240
DOI:	10.1007/978-3-030-79025-7_5
Description:	With the increasing demand for high-performance computing in application domains with stringent power budgets, coarse-grained reconfigurable array (CGRA) architectures have become a popular choice among researchers and manufacturers. Loops are the hot spots of kernels running on CGRAs, and hence several techniques have been devised to optimize loop execution. However, work in this direction consists predominantly of software-based solutions. This paper addresses the optimization opportunities at a deeper level and introduces a hardware-based loop control mechanism that can support arbitrarily nested loops up to four levels deep. The major contributions of this work are a lightweight Hardware Loop Block (HLB) for CGRAs that eliminates the control instruction overhead of loops, and an acyclic graph transformation that removes loop branches from the application CDFG. When tested on a set of kernels chosen from various application domains, the design achieved a maximum speed-up of 1.9× and an average speed-up of 1.5× over the conventional approach. The total number of instructions executed is halved for almost all the kernels, with area and power consumption overheads of 2.6% and 0.8%, respectively.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::936343a5f3c03725d894682148c97918 https://hal.archives-ouvertes.fr/hal-03345346