I/O separation scheme on Lustre metadata server based on multi-stream SSD.

Authors: Lee, Cheongjun; Lee, Jaehwan; Kim, Chungyong; Bang, Jiwoo; Byun, Eun-Kyu; Eom, Hyeonsang
Subject:
Source: Cluster Computing; Oct2023, Vol. 26 Issue 5, p2883-2896, 14p
Abstract: As the price of NAND-flash storage decreases, large-scale backend distributed file systems are being constructed as all-flash storage without HDDs. However, the performance of an SSD can decrease sharply due to internal garbage collection overhead along with write amplification. The Lustre distributed file system provides the Data-on-MDT (DoM) feature, which stores small files directly on the Metadata Server (MDS) instead of the Object Storage Server. Despite its benefit for communication traffic, DoM fills the Metadata Target (MDT) much faster, causing garbage collection with write amplification and drastically reducing the performance of the MDT. DoM I/O also consumes I/O bandwidth, starving other metadata I/O jobs on the MDS. We therefore propose two types of I/O separation scheme: data separation to address write amplification, and I/O bandwidth separation to address bandwidth starvation. We separate the physical placement of DoM data, normal metadata, and journaling data using a multi-stream SSD. We also virtually isolate the I/O resources of DoM I/O and metadata I/O by limiting the bandwidth of DoM I/O using Linux cgroup. Our schemes improve the I/O throughput of the MDT by 70% and IOPS by 81%, preventing write amplification, and provide stable metadata I/O performance on the MDS. [ABSTRACT FROM AUTHOR]
Database: Complementary Index
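
Illustrative note: the two separation schemes named in the abstract can be sketched roughly as follows. This is not the authors' implementation. The abstract says only "multi-stream SSD" and "Linux cgroup", so the stream IDs, the use of the cgroup v2 `io.max` line format, the device numbers, and the bandwidth limit below are all assumptions chosen for illustration.

```python
from enum import Enum
from typing import Optional


class IOClass(Enum):
    """The three kinds of MDT writes the abstract says are placed separately."""
    DOM_DATA = "dom_data"   # small-file data stored via Data-on-MDT
    METADATA = "metadata"   # normal metadata
    JOURNAL = "journal"     # journaling data


# Hypothetical stream IDs: one distinct stream per class, so that data with
# different lifetimes does not share flash erase blocks.
STREAM_ID = {
    IOClass.DOM_DATA: 1,
    IOClass.METADATA: 2,
    IOClass.JOURNAL: 3,
}


def stream_for(io_class: IOClass) -> int:
    """Return the (assumed) stream ID that a write of this class is tagged with."""
    return STREAM_ID[io_class]


def io_max_line(major: int, minor: int,
                rbps: Optional[int] = None,
                wbps: Optional[int] = None) -> str:
    """Build a line in the cgroup v2 io.max format: 'MAJ:MIN rbps=N wbps=N'.

    Limits left as None are omitted from the line.
    """
    parts = [f"{major}:{minor}"]
    if rbps is not None:
        parts.append(f"rbps={rbps}")
    if wbps is not None:
        parts.append(f"wbps={wbps}")
    return " ".join(parts)


if __name__ == "__main__":
    # Assumed example: cap DoM writes at 100 MiB/s on a device numbered 259:0.
    print(stream_for(IOClass.DOM_DATA))
    print(io_max_line(259, 0, wbps=100 * 1024 * 1024))
```

On a real system, a line like the one built by `io_max_line` would be written to the `io.max` file of the cgroup holding the DoM I/O threads; which threads those are, and what limit suits a given MDS, is not stated in the abstract.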