Popis: |
The nature of data for scientific computation is very diverse in the age of big data. First, data may reside at a number of locations, e.g. the scientist's machine, an institutional filesystem, a remote service, or some kind of database. Second, its size may range from a few kilobytes to many terabytes. To be available for computation, data has to be transferred to the location where the computation takes place. This requires a diverse set of middleware tools that are compatible with both the data and the compute resources. However, using these tools requires additional knowledge and makes running experiments inconvenient. In this paper we present the Data Bridge, a high-level service that can easily be used in scientific computations to transfer data to and from a diverse set of storage services. The Data Bridge not only unifies access to different types of storage services, but can also be used at different levels of scientific computation (e.g., single jobs, parameter sweeps, scientific workflows).