Flexible ingest framework: A scalable architecture for dynamic routing through composable pipelines

Authors: Alexei Samoylov, Jason Schlachter
Year of publication: 2015
Subject:
Source: IEEE BigData
DOI: 10.1109/bigdata.2015.7364097
Description: In this paper we describe a flexible and scalable big data ingestion framework based on Apache Spark. It is flexible in that meta-information about the data is used to build custom processing pipelines at run-time. It is scalable in that it leverages Apache Spark with minimal additional overhead. These capabilities allow a user to set up custom big data processing pipelines that handle changing data types without recompiling code in an operational environment. This is particularly advantageous in secure environments where recompilation is undesirable or impossible.
Database: OpenAIRE
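
Illustrative note: the description says that meta-information about the data is used to build processing pipelines at run-time, so new data types can be handled without recompiling. The sketch below shows one way that idea could look in plain Python. It is not the authors' implementation and does not use Apache Spark; the names `STAGES`, `ROUTES`, `build_pipeline`, and `ingest`, as well as the stage names and the `type` metadata key, are all hypothetical and chosen only for illustration.

```python
from functools import reduce
from typing import Callable, Dict, List

Record = Dict[str, str]
Stage = Callable[[Record], Record]

# Hypothetical registry of reusable, composable stages (assumption, not from the paper).
STAGES: Dict[str, Stage] = {
    "strip": lambda r: {k: v.strip() for k, v in r.items()},
    "lowercase_keys": lambda r: {k.lower(): v for k, v in r.items()},
    "tag_source": lambda r: {**r, "source": "ingest"},
}

# Hypothetical routing table: data type -> ordered list of stage names.
# In a deployed system this would be loaded from configuration, so it can
# change without touching or recompiling the code.
ROUTES: Dict[str, List[str]] = {
    "csv": ["strip", "lowercase_keys"],
    "log": ["strip", "tag_source"],
}


def build_pipeline(data_type: str) -> Stage:
    """Compose the stages registered for data_type into a single function."""
    stages = [STAGES[name] for name in ROUTES[data_type]]
    return lambda record: reduce(lambda acc, stage: stage(acc), stages, record)


def ingest(record: Record, meta: Dict[str, str]) -> Record:
    """Route a record through the pipeline selected by its metadata."""
    return build_pipeline(meta["type"])(record)
```

Under these assumptions, supporting a new data type amounts to adding an entry to `ROUTES` at run-time, for example `ROUTES["json"] = ["tag_source"]`; no existing code changes. In a Spark-based setting, the composed function could plausibly be applied to each record of a distributed dataset, but that step is outside this sketch.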