Showing 1 - 10 of 275 for search: '"Hellerstein, Joseph M"'
Streaming systems are present throughout modern applications, processing continuous data in real-time. Existing streaming languages have a variety of semantic models and guarantees that are often incompatible. Yet all these languages are considered "
External link:
http://arxiv.org/abs/2411.08274
Authors:
Garcia, Rolando; Kallanagoudar, Pragya; Anand, Chithra; Chasins, Sarah E.; Hellerstein, Joseph M.; Kerrison, Erin Michelle Turner; Parameswaran, Aditya G.
In this paper we present techniques to incrementally harvest and query arbitrary metadata from machine learning pipelines, without disrupting agile practices. We center our approach on the developer-favored technique for generating metadata -- log st
External link:
http://arxiv.org/abs/2408.02498
Programming models for distributed dataflow have long focused on analytical workloads that allow the runtime to dynamically place and schedule compute logic. Meanwhile, models that enable fine-grained control over placement, such as actors, make glob
External link:
http://arxiv.org/abs/2406.14733
Published in:
Proc. ACM Hum.-Comput. Interact. 8, CSCW1, Article 206 (April 2024)
Organizations rely on machine learning engineers (MLEs) to deploy models and maintain ML pipelines in production. Due to models' extensive reliance on fresh data, the operationalization of machine learning, or MLOps, requires MLEs to have proficiency
External link:
http://arxiv.org/abs/2403.16795
Authors:
Chu, David; Panchapakesan, Rithvik; Laddad, Shadaj; Katahanas, Lucky; Liu, Chris; Shivakumar, Kaushik; Crooks, Natacha; Hellerstein, Joseph M.; Howard, Heidi
Distributed protocols such as 2PC and Paxos lie at the core of many systems in the cloud, but standard implementations do not scale. New scalable distributed protocols are developed through careful analysis and rewrites, but this process is ad hoc an
External link:
http://arxiv.org/abs/2404.01593
Authors:
Garcia, Rolando; Dandamudi, Anusha; Matute, Gabriel; Wan, Lehan; Gonzalez, Joseph; Hellerstein, Joseph M.; Sen, Koushik
Production Machine Learning involves continuous training: hosting multiple versions of models over time, often with many model versions running at once. When model performance does not meet expectations, Machine Learning Engineers (MLEs) debug issues
External link:
http://arxiv.org/abs/2310.07898
We propose cloud oracles, an alternative to machine learning for online optimization of cloud configurations. Our cloud oracle approach guarantees complete accuracy and explainability of decisions for problems that can be formulated as parametric con
External link:
http://arxiv.org/abs/2308.06815
Optimizing a stateful dataflow language is a challenging task. There are strict correctness constraints for preserving properties expected by downstream consumers, a large space of possible optimizations, and complex analyses that must reason about t
External link:
http://arxiv.org/abs/2306.10585
Published in:
The 5th workshop on Advanced tools, programming languages, and PLatforms for Implementing and Evaluating algorithms for Distributed systems (ApPLIED 2023), June 19, 2023, Orlando, FL, USA
In the Hydro project we are designing a compiler toolkit that can optimize for the concerns of distributed systems, including scale-up and scale-down, availability, and consistency of outcomes across replicas. This invited paper overviews the project
External link:
http://arxiv.org/abs/2305.14614
Authors:
Laddad, Shadaj; Power, Conor; Milano, Mae; Cheung, Alvin; Crooks, Natacha; Hellerstein, Joseph M.
Despite decades of research and practical experience, developers have few tools for programming reliable distributed applications without resorting to expensive coordination techniques. Conflict-free replicated datatypes (CRDTs) are a promising line
External link:
http://arxiv.org/abs/2210.12605