Description: |
The amount of data generated by applications and digital sources is rising to unprecedented scales. To keep pace, the applications and workflows that transform this data into insight are becoming increasingly dynamic and inherently data-driven. Furthermore, the computational services (compute, data, and communication) required to run this emerging class of applications are often just as dynamic and heterogeneous. As data sizes continue to grow, new ways of harnessing these services are needed to meet the demands of emerging data-driven workloads. Building a computational environment capable of supporting these applications presents many complex challenges. For example, requirements and dynamic behaviors are imposed by multiple components of the environment: users, service providers, applications, and computational services. Accordingly, an environment must be capable of (1) providing a way for these components to express their requirements at any time in the application lifecycle and (2) reacting in real time to changes introduced by any of these components by adjusting the service composition. While cloud computing provides the flexibility and diversity of services required by such an environment, current service models and infrastructure do not support determining which services to compose to meet application needs and when to compose them. To address these challenges, this dissertation presents a programming system to enable the creation of a distributed Software-Defined Environment (dSDE); the resulting environment can seamlessly and symbiotically combine compute, data sources, data storage, and network resources.
Specifically, this work makes the following contributions: (1) it enables the on-demand aggregation of distributed services while facilitating the continuous deployment of applications on top of them; (2) it provides programming abstractions that allow users, resource providers, and applications to dynamically compose different services based on constraints or requirements; (3) it introduces a runtime framework that can autonomously adapt to changes from any of the components in the environment; and (4) it defines a quantification model for the application performance and expected quality of service of the resulting dSDE, which allows users to reason about trade-offs and requirements involving metrics such as throughput, latency, cost, and deadlines. The applicability of this work to real-world scientific applications is validated through a series of experiments in which heterogeneous, geographically distributed services are composed based on user, resource provider, and application specifications. The results establish the potential impact of a system capable of adapting in real time to changes in mixed resource environments, including multiple clouds, grids, clusters, supercomputers, and traditional data centers. |