Popis: |
Over the last decade, life science applications have become increasingly integrated into e-Science environments, since they are typically very demanding in terms of both computational capability and data capacity. Access to life science applications embedded in such environments via Grid clients still constitutes a major hurdle for scientists who do not have an IT background. Life science applications often comprise a whole set of small programs rather than a single executable. Many graphical Grid clients are poorly suited to this type of application, because they often assume that a Grid job runs a single executable rather than a set of chained executions (i.e. a sequence). Consequently, to execute a sequence of multiple programs on a single Grid resource, piping data from one program to the next, the user has to run a hand-written shell script. Otherwise, each program is scheduled independently as a Grid job, which causes unnecessary file transfers between the jobs even if they are scheduled on the same resource. We present a generic solution to this problem and provide a reference implementation that integrates seamlessly with the Grid middleware UNICORE. Our approach focuses on a comfortable user interface for creating such program sequences and has been validated in UNICORE-driven HPC-based Grids. We applied the approach to support the use of the AMBER package (a widely used collection of programs for molecular dynamics simulations) within Grid workflows. Finally, we present a scientific use case of our approach that leverages the interoperability of two different scientific infrastructures, which represents an instance of the infrastructure interoperability reference model. |