IOctopus
Authors: | Austin Bolen, Igor Smolyar, Alex Markuze, Dan Tsafrir, Gerd Zellweger, Haggai Eran, Boris Pismenny, Liran Liss, Adam Morrison |
---|---|
Year of publication: | 2020 |
Subject: | 010302 applied physics; Hardware_MEMORYSTRUCTURES; business.industry; Firmware; Computer science; 02 engineering and technology; computer.software_genre; 01 natural sciences; 020204 information systems; Embedded system; 0103 physical sciences; 0202 electrical engineering, electronic engineering, information engineering; Overhead (computing); Central processing unit; Latency (engineering); business; Throughput (business); computer; PCI Express; System software |
Source: | ASPLOS |
DOI: | 10.1145/3373376.3378509 |
Description: | In a multi-CPU server, memory modules are local to the CPU to which they are connected, forming a nonuniform memory access (NUMA) architecture. Because non-local accesses are slower than local accesses, the NUMA architecture might degrade application performance. Similar slowdowns occur when an I/O device issues nonuniform DMA (NUDMA) operations, as the device is connected to memory via a single CPU. NUDMA effects therefore degrade application performance similarly to NUMA effects. We observe that the similarity is not inherent but rather a product of disregarding the intrinsic differences between I/O and CPU memory accesses. Whereas NUMA effects are inevitable, we show that NUDMA effects can and should be eliminated. We present IOctopus, a device architecture that makes NUDMA impossible by unifying multiple physical PCIe functions (one per CPU) in a manner that makes them appear as one, both to the system software and externally to the server. IOctopus requires only a modest change to the device driver and firmware. We implement it on existing hardware and demonstrate that it improves throughput and latency by as much as 2.7x and 1.28x, respectively, while ridding developers of the need to combat (what appeared to be) an unavoidable type of overhead. |
Database: | OpenAIRE |