Showing 1 - 10 of 177 for search: '"Bailis, Peter"'
Author:
Davis, Jared Quincy; Hanin, Boris; Chen, Lingjiao; Bailis, Peter; Stoica, Ion; Zaharia, Matei
As practitioners seek to surpass the current reliability and quality frontier of monolithic models, Compound AI Systems consisting of many language model inference calls are increasingly employed. In this work, we construct systems, which we call Net
External link:
http://arxiv.org/abs/2407.16831
Author:
Chen, Lingjiao; Davis, Jared Quincy; Hanin, Boris; Bailis, Peter; Stoica, Ion; Zaharia, Matei; Zou, James
Many recent state-of-the-art results in language tasks were achieved using compound systems that perform multiple Language Model (LM) calls and aggregate their responses. However, there is little understanding of how the number of LM calls - e.g., wh
External link:
http://arxiv.org/abs/2403.02419
Autoregressive decoding of large language models (LLMs) is memory bandwidth bounded, resulting in high latency and significant wastes of the parallel processing power of modern accelerators. Existing methods for accelerating LLM decoding often requir
External link:
http://arxiv.org/abs/2402.02057
Author:
Liu, Xiaoxuan; Hu, Lanxiang; Bailis, Peter; Cheung, Alvin; Deng, Zhijie; Stoica, Ion; Zhang, Hao
Speculative decoding is a pivotal technique to accelerate the inference of large language models (LLMs) by employing a smaller draft model to predict the target model's outputs. However, its efficacy can be limited due to the low predictive accuracy
External link:
http://arxiv.org/abs/2310.07177
Author:
Kraft, Peter; Li, Qian; Kaffes, Kostis; Skiadopoulos, Athinagoras; Kumar, Deeptaanshu; Cho, Danny; Li, Jason; Redmond, Robert; Weckwerth, Nathan; Xia, Brian; Bailis, Peter; Cafarella, Michael; Graefe, Goetz; Kepner, Jeremy; Kozyrakis, Christos; Stonebraker, Michael; Suresh, Lalith; Yu, Xiangyao; Zaharia, Matei
Developers increasingly use function-as-a-service (FaaS) platforms for data-centric applications that perform low-latency and transactional operations on data, such as for microservices or web serving. Unfortunately, existing FaaS platforms support t
External link:
http://arxiv.org/abs/2208.13068
Published in:
SIGMOD 2022
ML is being deployed in complex, real-world scenarios where errors have impactful consequences. In these systems, thorough testing of the ML pipelines is critical. A key component in ML deployment pipelines is the curation of labeled training data. C
External link:
http://arxiv.org/abs/2201.05797
Earthquake monitoring by seismic networks typically involves a workflow consisting of phase detection/picking, association, and location tasks. In recent years, the accuracy of these individual stages has been improved through the use of machine lear
External link:
http://arxiv.org/abs/2109.09911
Published in:
PVLDB, 14(11): 2341 - 2354, 2021
Researchers and industry analysts are increasingly interested in computing aggregation queries over large, unstructured datasets with selective predicates that are computed using expensive deep neural networks (DNNs). As these DNNs are expensive and
External link:
http://arxiv.org/abs/2108.06313
Given a dataset $\mathcal{D}$, we are interested in computing the mean of a subset of $\mathcal{D}$ which matches a predicate. ABae leverages stratified sampling and proxy models to efficiently compute this statistic given a sampling budget $N$. In t
External link:
http://arxiv.org/abs/2107.12525
Self-training is a standard approach to semi-supervised learning where the learner's own predictions on unlabeled data are used as supervision during training. In this paper, we reinterpret this label assignment process as an optimal transportation p
External link:
http://arxiv.org/abs/2102.08622