Scalable Data-Intensive Geocomputation: A Design for Real-Time Continental Flood Inundation Mapping

Author:	Jibonananda Sanyal, Yan Y. Liu
Publication year:	2020
Subject:	Interactive computing; Distributed computing; Geographic information system; Geospatial analysis; Computer science; Data stream mining; Environmental engineering; Engineering and technology; Data modeling; Workflow; Scalability; Electrical engineering, electronic engineering, information engineering; Spatial analysis
Source:	Communications in Computer and Information Science ISBN: 9783030633929 SMC
DOI:	10.1007/978-3-030-63393-6_9
Description:	The convergence of data-intensive and extreme-scale computing behooves an integrated software and data ecosystem for scientific discovery. Developments in this realm will fuel transformative research in data-driven interdisciplinary domains. Geocomputation provides computing paradigms in Geographic Information Systems (GIS) for interactive computing of geographic data, processes, models, and maps. Because GIS is data-driven, the computational scalability of a geocomputation workflow is directly related to the scale of the GIS data layers, their resolution and extent, as well as the velocity of the geo-located data streams to be processed. Geocomputation applications, which have high user interactivity and low end-to-end latency requirements, will dramatically benefit from the convergence of high-end data analytics (HDA) and high-performance computing (HPC). In an application, we must identify and eliminate computational bottlenecks that arise in a geocomputation workflow. Indeed, poor scalability at any of the workflow components is detrimental to the entire end-to-end pipeline. Here, we study a large geocomputation use case in flood inundation mapping that handles multiple national-scale geospatial datasets and targets low end-to-end latency. We discuss the benefits and challenges for harnessing both HDA and HPC for data-intensive geospatial data processing and intensive numerical modeling of geographic processes. We propose an HDA+HPC geocomputation architecture design that couples HDA (e.g., Spark)-based spatial data handling and HPC-based parallel data modeling. Key techniques for coupling HDA and HPC to bridge the two different software stacks are reviewed and discussed.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::86afb25cbbd4907b4f9fcc98ef894d94 https://doi.org/10.1007/978-3-030-63393-6_9