Partially shared cache and adaptive replacement algorithm for NoC-based many-core systems
Author: | Hongwei Ye, Quan Wang, Zhiqiang Zhang, Pengfei Yang |
---|---|
Year of publication: | 2019 |
Subject: |
010302 applied physics, Instructions per cycle, Hardware_MEMORYSTRUCTURES, 060102 archaeology, CPU cache, Computer science, Node (networking), 06 humanities and the arts, 01 natural sciences, Shared memory, Hardware and Architecture, 0103 physical sciences, Scalability, 0601 history and archaeology, Local bus, Cache, Algorithm, Throughput (business), Software |
Source: | Journal of Systems Architecture. 98:424-433 |
ISSN: | 1383-7621 |
DOI: | 10.1016/j.sysarc.2019.05.002 |
Description: | The Network-on-Chip (NoC) is a promising alternative to traditional bus-based architectures and has been widely applied to interconnect multi/many-core systems due to its scalable and modular design. The memory wall problem remains one of the most important challenges; however, it can be alleviated to some extent by cache subsystems. In this paper, to overcome the high resource consumption and low data-sharing rate of the private cache scheme, we propose a partially shared cache structure and a corresponding replacement algorithm based on a mesh NoC. In this scheme, the L2 cache is shared by each group of four cores that are connected as a cluster to a given node by the local bus. To maximize the performance of this partially shared cache structure, we propose a core-aware re-reference interval prediction (CA-RRIP) replacement algorithm. The algorithm performs dynamic virtual partitioning on the partially shared cache; the core that initiated the cache access request is given top priority when a cache area needs to be replaced or inserted. This approach guarantees cache exclusivity and can mitigate interactions among cores using different access patterns. We implement the traditional private, the proposed partially shared, and the row-shared cache subsystems in our experiments. The comparisons indicate that the overall system resource occupation can be reduced by 20% with the same number of cores, and the instructions per cycle (IPC) of the system can increase by up to 49.2%. Moreover, the system throughput (STP) increases by an average of 5.89%. Our experimental results show that the proposed CA-RRIP algorithm also reduces the average cache miss rate of the system under various cache access patterns. |
Database: | OpenAIRE |
External link: |
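
The abstract describes CA-RRIP only at a high level, so the following is a minimal illustrative sketch, not the authors' implementation. It models one set of a cache shared by a cluster of cores, using the generic 2-bit re-reference prediction value (RRPV) scheme. The core-aware step is one possible reading of "the requesting core is given top priority": among lines predicted for distant re-reference, the requester's own lines are evicted first, which limits interference with other cores. The names `Line`, `CoreAwareSet`, `access`, and `_pick_victim`, and the choice of 2 RRPV bits, are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

RRPV_BITS = 2                     # assumed counter width (not from the abstract)
RRPV_MAX = (1 << RRPV_BITS) - 1   # 3 means "re-reference predicted far in the future"


@dataclass
class Line:
    tag: Optional[int] = None     # None marks an empty way
    owner: Optional[int] = None   # core that inserted the line
    rrpv: int = RRPV_MAX


class CoreAwareSet:
    """One set of a cache shared by a small cluster of cores (hypothetical model)."""

    def __init__(self, ways: int = 8):
        self.lines = [Line() for _ in range(ways)]

    def access(self, core: int, tag: int) -> bool:
        """Return True on a hit; on a miss, fill a victim way and return False."""
        for line in self.lines:
            if line.tag == tag:
                line.rrpv = 0     # hit: predict near-immediate re-reference
                return True
        victim = self._pick_victim(core)
        victim.tag, victim.owner = tag, core
        victim.rrpv = RRPV_MAX - 1  # insert with a "long" re-reference prediction
        return False

    def _pick_victim(self, core: int) -> Line:
        # Use an empty way if one exists.
        for line in self.lines:
            if line.tag is None:
                return line
        while True:
            candidates = [l for l in self.lines if l.rrpv == RRPV_MAX]
            if candidates:
                # Assumed core-aware step: prefer a candidate owned by the requester.
                for l in candidates:
                    if l.owner == core:
                        return l
                return candidates[0]
            # No candidate yet: age every line and retry.
            for l in self.lines:
                l.rrpv += 1


# Usage: two cores share a 2-way set.
s = CoreAwareSet(ways=2)
s.access(0, 1)   # core 0 fills way 0
s.access(1, 2)   # core 1 fills way 1
s.access(1, 3)   # core 1 misses and evicts its own line (tag 2), not core 0's
```

In this sketch a burst of misses from core 1 displaces only core 1's own line, whereas choosing the first distant-re-reference candidate regardless of owner would have evicted core 0's line. Whether the published algorithm resolves priority this way cannot be determined from the abstract alone.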