Showing 1 - 10 of 64 for search: '"Park, Yongjoo"'
Computational notebooks (e.g., Jupyter, Google Colab) are widely used by data scientists. A key feature of notebooks is the interactive computing model of iteratively executing cells (i.e., a set of statements) and observing the result (e.g., model o
External link:
http://arxiv.org/abs/2406.13856
Computational notebooks (e.g., Jupyter, Google Colab) are widely used for interactive data science and machine learning. In those frameworks, users can start a session, then execute cells (i.e., a set of statements) to create variables, train models,
External link:
http://arxiv.org/abs/2309.11083
The end-to-end lookup latency of a hierarchical index -- such as a B-tree or a learned index -- is determined by its structure such as the number of layers, the kinds of branching functions appearing in each layer, the amount of data we must fetch fr
External link:
http://arxiv.org/abs/2306.14395
In machine learning (ML), Python serves as a convenient abstraction for working with key libraries such as PyTorch, scikit-learn, and others. Unlike DBMS, however, Python applications may lose important data, such as trained models and extracted feat
External link:
http://arxiv.org/abs/2305.08770
With data pipeline tools and the expressiveness of SQL, managing interdependent materialized views (MVs) is becoming increasingly easy. These MVs are updated repeatedly upon new data ingestion (e.g., daily), from which database admins can observe pe
External link:
http://arxiv.org/abs/2303.09774
Authors:
Sheoran, Nikhil, Chockchowwat, Supawit, Chheda, Arav, Wang, Suwen, Verma, Riya, Park, Yongjoo
For exploratory data analysis, it is often desirable to know what answers you are likely to get before actually obtaining those answers. This can potentially be achieved by designing systems to offer the estimates of a data operation result -- say op
External link:
http://arxiv.org/abs/2303.04103
Existing learned indexes (e.g., RMI, ALEX, PGM) optimize the internal regressor of each node, not the overall structure such as index height, the size of each layer, etc. In this paper, we share our recent findings that we can achieve significantly f
External link:
http://arxiv.org/abs/2208.03823
Authors:
Park, Chae-Yeon, Lin Yang, Hae, Kim, Hye-Mi, Kim, Daejung, Park, Yongjoo, Park, Jongruyl, Shin, Seokhee, Park, Jin-Seong
Published in:
In Applied Surface Science 15 October 2024 670
Modern data warehouses can scale compute nodes independently of storage. These systems persist their data on cloud storage, which is always available and cost-efficient. Ad-hoc compute nodes then fetch necessary data on-demand from cloud storage. Thi
External link:
http://arxiv.org/abs/2112.13323
Published in:
In Ceramics International 1 November 2024 50(21) Part A:41483-41489