In-database distributed machine learning

Authors:	Mohammed Al-Kateb, Mani Srivastava, Sanjay Nair, Wellington Cabrera, Sandeep Singh Sandha
Year of publication:	2019
Subject:	SQL; 010504 meteorology & atmospheric sciences; Database; Artificial neural network; business.industry; Computer science; Big data; General Engineering; Python (programming language); 010502 geochemistry & geophysics; computer.software_genre; Machine learning; 01 natural sciences; Bottleneck; Overhead (computing); Artificial intelligence; business; computer; 0105 earth and related environmental sciences; computer.programming_language
Source:	Proceedings of the VLDB Endowment. 12:1854-1857
ISSN:	2150-8097
DOI:	10.14778/3352063.3352083
Description:	Machine learning has enabled many interesting applications and is used extensively in big data systems. The popular approach, training machine learning models in frameworks such as TensorFlow, PyTorch, and Keras, requires moving data from database engines to analytical engines, which imposes excessive overhead on data scientists and becomes a performance bottleneck for model training. In this demonstration, we give a practical exhibition of a solution that enables distributed machine learning natively inside database engines. During the demo, the audience will interactively use Python APIs in Jupyter Notebooks to train multiple linear regression models on synthetic regression datasets and neural network models on vision and sensory datasets directly inside Teradata SQL Engine.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::0c941772c4f311b7c2a4ed5bda9e9e9e https://doi.org/10.14778/3352063.3352083