Elastic Machine Learning Algorithms in Amazon SageMaker

Authors:	Amir Sadoughi, Sebastian Schelter, Julio Delgado, Valentin Flunkert, Madhav Jha, Edo Liberty, Bing Xiang, Ramesh Nallapati, Syama Sundar Rangapuram, Lorenzo Stella, David Arpin, Jan Gasthaus, Yuyang Wang, Yury Astashonok, David Salinas, Zohar Karnin, Can Balioglu, Baris Coskun, Philip Gautier, Saswata Chakravarty, Laurence Rouesnel, Piali Das, Alexander J. Smola, Tim Januschowski
Publication year:	2020
Subject:	Computer science; business.industry; Amazon rainforest; Computation; 02 engineering and technology; 010501 environmental sciences; Machine learning; computer.software_genre; 01 natural sciences; Hyperparameter optimization; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Artificial intelligence; business; computer; Algorithm; 0105 earth and related environmental sciences
Source:	SIGMOD Conference
DOI:	10.1145/3318464.3386126
Description:	There is a large body of research on scalable machine learning (ML). Nevertheless, training ML models on large, continuously evolving datasets is still a difficult and costly undertaking for many companies and institutions. We discuss such challenges and derive requirements for an industrial-scale ML platform. Next, we describe the computational model behind Amazon SageMaker, which is designed to meet such challenges. SageMaker is an ML platform provided as part of Amazon Web Services (AWS), and supports incremental training, resumable and elastic learning as well as automatic hyperparameter optimization. We detail how to adapt several popular ML algorithms to its computational model. Finally, we present an experimental evaluation on large datasets, comparing SageMaker to several scalable, JVM-based implementations of ML algorithms, which we significantly outperform with regard to computation time and cost.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::e769140fd8e6cb1552bc1125581e3615 https://doi.org/10.1145/3318464.3386126