Showing 1 - 10 of 22 for search: '"Macke, Stephen"'
Authors:
Macke, Stephen, Gong, Hongpu, Lee, Doris Jung-Lin, Head, Andrew, Xin, Doris, Parameswaran, Aditya
Computational notebooks have emerged as the platform of choice for data science and analytical workflows, enabling rapid iteration and exploration. By keeping intermediate program state in memory and segmenting units of execution into so-called "cell
External link:
http://arxiv.org/abs/2012.06981
Authors:
Macke, Stephen, Aliakbarpour, Maryam, Diakonikolas, Ilias, Parameswaran, Aditya, Rubinfeld, Ronitt
Aggregating data is fundamental to data analytics, data exploration, and OLAP. Approximate query processing (AQP) techniques are often used to accelerate computation of aggregates using samples, for which confidence intervals (CIs) are widely used to
External link:
http://arxiv.org/abs/2008.03891
Authors:
Petersohn, Devin, Macke, Stephen, Xin, Doris, Ma, William, Lee, Doris, Mo, Xiangxi, Gonzalez, Joseph E., Hellerstein, Joseph M., Joseph, Anthony D., Parameswaran, Aditya
Dataframes are a popular abstraction to represent, prepare, and analyze data. Despite the remarkable success of dataframe libraries in R and Python, dataframes face performance issues even on moderately large datasets. Moreover, there is significant a
External link:
http://arxiv.org/abs/2001.00888
Machine learning workflow development is a process of trial-and-error: developers iterate on workflows by testing out small modifications until the desired accuracy is achieved. Unfortunately, existing machine learning systems focus narrowly on model
External link:
http://arxiv.org/abs/1812.05762
Data application developers and data scientists spend an inordinate amount of time iterating on machine learning (ML) workflows -- by modifying the data pre-processing, model training, and post-processing steps -- via trial-and-error to achieve the d
External link:
http://arxiv.org/abs/1808.01095
Development of machine learning (ML) workflows is a tedious process of iterative experimentation: developers repeatedly make changes to workflows until the desired accuracy is attained. We describe our vision for a "human-in-the-loop" ML system that
External link:
http://arxiv.org/abs/1804.05892
This paper addresses the Data-Diff problem: given a dataset and a subsequent version of the dataset, find the shortest sequence of operations that transforms the dataset to the subsequent version, under a restricted family of operations. We consider
External link:
http://arxiv.org/abs/1801.06258
In exploratory data analysis, analysts often have a need to identify histograms that possess a specific distribution, among a large class of candidate histograms, e.g., find countries whose income distribution is most similar to that of Greece. This
External link:
http://arxiv.org/abs/1708.05918