Software-defined storage for fast trajectory queries using a DeltaFS indexed massive directory
Authors: | Saurabh Kadekodi, Charles D. Cranor, George Amvrosiadis, Gary Grider, Garth A. Gibson, Bradley W. Settlemyer, Qing Zheng, Fan Guo |
---|---|
Year of publication: | 2017 |
Subject: |
File system; 020203 distributed computing; Speedup; Computer science; Search engine indexing; 02 engineering and technology; Directory; Parallel computing; computer.software_genre; Node (computer science); Scalability; Data_FILES; 0202 electrical engineering, electronic engineering, information engineering; Overhead (computing); 020201 artificial intelligence & image processing; computer; Software-defined storage |
Source: | PDSW-DISCS@SC |
Description: | In this paper we introduce the Indexed Massive Directory, a new technique for indexing data within DeltaFS. With its design as a scalable, server-less file system for HPC platforms, DeltaFS scales file system metadata performance with application scale. The Indexed Massive Directory is a novel extension to the DeltaFS data plane, enabling in-situ indexing of massive amounts of data written to a single directory simultaneously, and in an arbitrarily large number of files. We achieve this through a memory-efficient indexing mechanism for reordering and indexing data, and a log-structured storage layout to pack small writes into large log objects, all while ensuring compute node resources are used frugally. We demonstrate the efficiency of this indexing mechanism through VPIC, a widely used simulation code that scales to trillions of particles. With DeltaFS, we modify VPIC to create a file for each particle to receive writes of that particle's output data. Dynamically indexing the directory's underlying storage allows us to achieve a 5000x speedup in single-particle trajectory queries, which require reading all data for a single particle. This speedup increases with application scale while the overhead is fixed at 3% of available memory. |
Database: | OpenAIRE |
External link: |
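
The description's idea of packing many small per-particle writes into one large log object, while keeping an index so that a single particle's records can be read back without scanning everything, can be illustrated with a toy sketch. This is not the DeltaFS implementation or API: the class `IndexedLog`, its methods, and the particle keys below are hypothetical names chosen for illustration, and the index is held entirely in memory rather than in the memory-efficient form the paper describes.

```python
from collections import defaultdict


class IndexedLog:
    """Toy append-only log with a per-key index (illustrative only, not DeltaFS)."""

    def __init__(self):
        self.log = bytearray()          # stands in for one large log object
        self.index = defaultdict(list)  # key -> list of (offset, length)

    def append(self, key, payload: bytes):
        # Small writes are packed back to back into the log;
        # the index remembers where each key's records landed.
        offset = len(self.log)
        self.log += payload
        self.index[key].append((offset, len(payload)))

    def query(self, key):
        # Read only the byte ranges recorded for this key, in write order.
        return [bytes(self.log[o:o + n]) for o, n in self.index.get(key, [])]


# Hypothetical usage: two particles each write one record per timestep.
store = IndexedLog()
for step in range(3):
    for pid in ("p0", "p1"):
        store.append(pid, f"{pid}@t{step};".encode())

trajectory = store.query("p1")
```

In this sketch a trajectory query touches only the ranges listed for one key, which is the property the description credits for the speedup over reading all data to find a single particle.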