A Hypervisor for Shared-Memory FPGA Platforms
Author: | Gefei Zuo, Zhengwei Qi, Xiaohe Cheng, Yanqiang Liu, Baris Kasikci, Kevin Loughlin, Abel Mulugeta Eneyew, Jiacheng Ma |
---|---|
Publication year: | 2020 |
Subject: | Address space; Computer science; Hypervisor; Cloud computing; Virtualization; Shared memory; Scalability; Programming paradigm; Operating system; Page table; 01 natural sciences; 0103 physical sciences; 010302 applied physics; 02 engineering and technology; 0202 electrical engineering, electronic engineering, information engineering; 020202 computer hardware & architecture |
Source: | ASPLOS |
DOI: | 10.1145/3373376.3378482 |
Description: | Cloud providers widely deploy FPGAs as application-specific accelerators for customer use. These providers seek to multiplex their FPGAs among customers via virtualization, thereby reducing running costs. Unfortunately, most virtualization support is confined to FPGAs that expose a restrictive, host-centric programming model in which accelerators cannot issue direct memory accesses (DMAs). The host-centric model incurs high runtime overhead for workloads that exhibit pointer chasing. Thus, FPGAs are beginning to support a shared-memory programming model in which accelerators can issue DMAs. However, virtualization support for shared-memory FPGAs is limited. This paper presents Optimus, the first hypervisor that supports scalable shared-memory FPGA virtualization. Optimus offers both spatial multiplexing and temporal multiplexing to provide efficient and flexible sharing of each accelerator on an FPGA. To share the FPGA-CPU interconnect at a high clock frequency, Optimus implements a multiplexer tree. To isolate each guest's address space, Optimus introduces the technique of page table slicing as a hardware-software co-design. To support preemptive temporal multiplexing, Optimus provides an accelerator preemption interface. We show that Optimus supports eight physical accelerators on a single FPGA and improves the aggregate throughput of twelve real-world benchmarks by 1.98x-7x. |
Database: | OpenAIRE |