Link Prediction (LP) is an essential task over Knowledge Graphs (KGs), traditionally focussed on using and predicting the relations between entities. Textual entity descriptions have already been shown to be valuable, but models that incorporate numeri
External link:
http://arxiv.org/abs/2407.18241
Deep learning algorithms have been shown to be powerful in many communication network design problems, including that in automatic modulation classification. However, they are vulnerable to carefully crafted attacks called adversarial examples. Hence
External link:
http://arxiv.org/abs/2407.06796
This work considers the decentralized successive convex approximation (SCA) method for minimizing stochastic non-convex objectives subject to convex constraints, along with possibly non-smooth convex regularizers. Although SCA has been widely applied
External link:
http://arxiv.org/abs/2405.07100
Authors:
Golden, Alicia, Hsia, Samuel, Sun, Fei, Acun, Bilge, Hosmer, Basil, Lee, Yejin, DeVito, Zachary, Johnson, Jeff, Wei, Gu-Yeon, Brooks, David, Wu, Carole-Jean
Training large-scale machine learning models poses distinct system challenges, given both the size and complexity of today's workloads. Recently, many organizations training state-of-the-art Generative AI models have reported cases of instability dur
External link:
http://arxiv.org/abs/2405.02803
Authors:
Saab, Khaled, Tu, Tao, Weng, Wei-Hung, Tanno, Ryutaro, Stutz, David, Wulczyn, Ellery, Zhang, Fan, Strother, Tim, Park, Chunjong, Vedadi, Elahe, Chaves, Juanma Zambrano, Hu, Szu-Yeu, Schaekermann, Mike, Kamath, Aishwarya, Cheng, Yong, Barrett, David G. T., Cheung, Cathy, Mustafa, Basil, Palepu, Anil, McDuff, Daniel, Hou, Le, Golany, Tomer, Liu, Luyang, Alayrac, Jean-baptiste, Houlsby, Neil, Tomasev, Nenad, Freyberg, Jan, Lau, Charles, Kemp, Jonas, Lai, Jeremy, Azizi, Shekoofeh, Kanada, Kimberly, Man, SiWai, Kulkarni, Kavita, Sun, Ruoxi, Shakeri, Siamak, He, Luheng, Caine, Ben, Webson, Albert, Latysheva, Natasha, Johnson, Melvin, Mansfield, Philip, Lu, Jian, Rivlin, Ehud, Anderson, Jesper, Green, Bradley, Wong, Renee, Krause, Jonathan, Shlens, Jonathon, Dominowska, Ewa, Eslami, S. M. Ali, Chou, Katherine, Cui, Claire, Vinyals, Oriol, Kavukcuoglu, Koray, Manyika, James, Dean, Jeff, Hassabis, Demis, Matias, Yossi, Webster, Dale, Barral, Joelle, Corrado, Greg, Semturs, Christopher, Mahdavi, S. Sara, Gottweis, Juraj, Karthikesalingam, Alan, Natarajan, Vivek
Excellence in a wide variety of medical applications poses considerable challenges for AI, requiring advanced reasoning, access to up-to-date medical knowledge and understanding of complex multimodal data. Gemini models, with strong general capabilit
External link:
http://arxiv.org/abs/2404.18416
Authors:
Elhoushi, Mostafa, Shrivastava, Akshat, Liskovich, Diana, Hosmer, Basil, Wasti, Bram, Lai, Liangzhen, Mahmoud, Anas, Acun, Bilge, Agarwal, Saurabh, Roman, Ahmed, Aly, Ahmed A, Chen, Beidi, Wu, Carole-Jean
We present LayerSkip, an end-to-end solution to speed up inference of large language models (LLMs). First, during training we apply layer dropout, with low dropout rates for earlier layers and higher dropout rates for later layers, and an early exit
External link:
http://arxiv.org/abs/2404.16710
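The layer dropout idea in the LayerSkip abstract above (low dropout rates for earlier layers, higher rates for later ones) can be illustrated with a minimal sketch. The linear ramp, the `p_max` value, and the function names `layer_dropout_rates` and `forward_with_layer_dropout` are assumptions made for illustration; the truncated abstract does not state the schedule the authors actually use.

```python
import random


def layer_dropout_rates(num_layers, p_max=0.2):
    """Per-layer skip probabilities rising linearly from 0 to p_max.

    The linear shape and the default p_max are illustrative assumptions,
    not values taken from the abstract.
    """
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    if num_layers == 1:
        return [0.0]
    return [p_max * i / (num_layers - 1) for i in range(num_layers)]


def forward_with_layer_dropout(x, layers, rates, rng, training=True):
    """Apply layers in order; during training, skip layer i with probability rates[i]."""
    for layer, p in zip(layers, rates):
        if training and rng.random() < p:
            continue  # skipped layer acts as the identity on x
        x = layer(x)
    return x


# Toy usage: four "layers" that each add 1 to a scalar input.
rng = random.Random(0)
layers = [lambda v: v + 1 for _ in range(4)]
rates = layer_dropout_rates(len(layers), p_max=0.3)
out = forward_with_layer_dropout(0, layers, rates, rng)
```

With `training=False` every layer is applied, so the toy stack above returns the input plus the number of layers.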
We consider stochastic optimization problems with functional constraints. If the objective and constraint functions are not convex, the classical stochastic approximation algorithms such as the proximal stochastic gradient descent do not lead to effi
External link:
http://arxiv.org/abs/2404.11790
Published in:
Open Communications in Nonlinear Mathematical Physics, Proceedings: OCNMP Conference, Bad Ems (Germany), 23-29 June 2024 (April 16, 2024) ocnmp:13267
We study the link between the degree growth of integrable birational mappings of order higher than two and their singularity structures. The higher order mappings we use in this study are all obtained by coupling mappings that are integrable through
External link:
http://arxiv.org/abs/2403.14329
Authors:
Misiakiewicz, Theodor, Saeed, Basil
We consider learning an unknown target function $f_*$ using kernel ridge regression (KRR) given i.i.d. data $(u_i,y_i)$, $i\leq n$, where $u_i \in U$ is a covariate vector and $y_i = f_* (u_i) +\varepsilon_i \in \mathbb{R}$. A recent string of work h
External link:
http://arxiv.org/abs/2403.08938
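For orientation, the kernel ridge regression setup in the abstract above is usually written as follows. This is the textbook form, not necessarily the exact normalization used in the paper; the kernel $k$, its reproducing kernel Hilbert space $\mathcal{H}$, and the regularization parameter $\lambda > 0$ are notation assumed here, since the truncated abstract does not introduce them.

$$
\hat f_\lambda = \arg\min_{f \in \mathcal{H}} \left\{ \frac{1}{n} \sum_{i=1}^{n} \bigl(y_i - f(u_i)\bigr)^2 + \lambda \|f\|_{\mathcal{H}}^2 \right\},
\qquad
\hat f_\lambda(u) = \mathbf{k}(u)^\top \bigl(K + n\lambda I_n\bigr)^{-1} \mathbf{y},
$$

where $K = \bigl(k(u_i, u_j)\bigr)_{i,j \leq n}$ is the kernel matrix, $\mathbf{k}(u) = \bigl(k(u, u_1), \dots, k(u, u_n)\bigr)^\top$, and $\mathbf{y} = (y_1, \dots, y_n)^\top$. The factor $n\lambda$ in the closed form follows from the $1/n$ normalization of the squared loss.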
Authors:
Agarwal, Saurabh, Acun, Bilge, Hosmer, Basil, Elhoushi, Mostafa, Lee, Yejin, Venkataraman, Shivaram, Papailiopoulos, Dimitris, Wu, Carole-Jean
Large Language Models (LLMs) with hundreds of billions of parameters have transformed the field of machine learning. However, serving these models at inference time is both compute and memory intensive, where a single request can require multiple GPU
External link:
http://arxiv.org/abs/2403.08058