Search results

Report

Numerical Literals in Link Prediction: A Critical Examination of Models and Datasets

Authors: Blum, Moritz, Ell, Basil, Ill, Hannes, Cimiano, Philipp

Link Prediction (LP) is an essential task over Knowledge Graphs (KGs), traditionally focussed on using and predicting the relations between entities. Textual entity descriptions have already been shown to be valuable, but models that incorporate numeri

External link: http://arxiv.org/abs/2407.18241

Report

Countermeasures Against Adversarial Examples in Radio Signal Classification

Authors: Zhang, Lu, Lambotharan, Sangarapillai, Zheng, Gan, AsSadhan, Basil, Roli, Fabio

Deep learning algorithms have been shown to be powerful in many communication network design problems, including that in automatic modulation classification. However, they are vulnerable to carefully crafted attacks called adversarial examples. Hence

External link: http://arxiv.org/abs/2407.06796

Report

Analysis of Decentralized Stochastic Successive Convex Approximation for composite non-convex problems

Authors: Idrees, Basil M., Sharma, Shivangi Dubey, Rajawat, Ketan

This work considers the decentralized successive convex approximation (SCA) method for minimizing stochastic non-convex objectives subject to convex constraints, along with possibly non-smooth convex regularizers. Although SCA has been widely applied

External link: http://arxiv.org/abs/2405.07100

Report

Is Flash Attention Stable?

Authors: Golden, Alicia, Hsia, Samuel, Sun, Fei, Acun, Bilge, Hosmer, Basil, Lee, Yejin, DeVito, Zachary, Johnson, Jeff, Wei, Gu-Yeon, Brooks, David, Wu, Carole-Jean

Training large-scale machine learning models poses distinct system challenges, given both the size and complexity of today's workloads. Recently, many organizations training state-of-the-art Generative AI models have reported cases of instability dur

External link: http://arxiv.org/abs/2405.02803

Report

Capabilities of Gemini Models in Medicine

Authors: Saab, Khaled, Tu, Tao, Weng, Wei-Hung, Tanno, Ryutaro, Stutz, David, Wulczyn, Ellery, Zhang, Fan, Strother, Tim, Park, Chunjong, Vedadi, Elahe, Chaves, Juanma Zambrano, Hu, Szu-Yeu, Schaekermann, Mike, Kamath, Aishwarya, Cheng, Yong, Barrett, David G. T., Cheung, Cathy, Mustafa, Basil, Palepu, Anil, McDuff, Daniel, Hou, Le, Golany, Tomer, Liu, Luyang, Alayrac, Jean-baptiste, Houlsby, Neil, Tomasev, Nenad, Freyberg, Jan, Lau, Charles, Kemp, Jonas, Lai, Jeremy, Azizi, Shekoofeh, Kanada, Kimberly, Man, SiWai, Kulkarni, Kavita, Sun, Ruoxi, Shakeri, Siamak, He, Luheng, Caine, Ben, Webson, Albert, Latysheva, Natasha, Johnson, Melvin, Mansfield, Philip, Lu, Jian, Rivlin, Ehud, Anderson, Jesper, Green, Bradley, Wong, Renee, Krause, Jonathan, Shlens, Jonathon, Dominowska, Ewa, Eslami, S. M. Ali, Chou, Katherine, Cui, Claire, Vinyals, Oriol, Kavukcuoglu, Koray, Manyika, James, Dean, Jeff, Hassabis, Demis, Matias, Yossi, Webster, Dale, Barral, Joelle, Corrado, Greg, Semturs, Christopher, Mahdavi, S. Sara, Gottweis, Juraj, Karthikesalingam, Alan, Natarajan, Vivek

Excellence in a wide variety of medical applications poses considerable challenges for AI, requiring advanced reasoning, access to up-to-date medical knowledge and understanding of complex multimodal data. Gemini models, with strong general capabilit

External link: http://arxiv.org/abs/2404.18416

Report

LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding

Authors: Elhoushi, Mostafa, Shrivastava, Akshat, Liskovich, Diana, Hosmer, Basil, Wasti, Bram, Lai, Liangzhen, Mahmoud, Anas, Acun, Bilge, Agarwal, Saurabh, Roman, Ahmed, Aly, Ahmed A, Chen, Beidi, Wu, Carole-Jean

We present LayerSkip, an end-to-end solution to speed up inference of large language models (LLMs). First, during training we apply layer dropout, with low dropout rates for earlier layers and higher dropout rates for later layers, and an early exit

External link: http://arxiv.org/abs/2404.16710

Report

Constrained Stochastic Recursive Momentum Successive Convex Approximation

Authors: Idrees, Basil M., Arora, Lavish, Rajawat, Ketan

We consider stochastic optimization problems with functional constraints. If the objective and constraint functions are not convex, the classical stochastic approximation algorithms such as the proximal stochastic gradient descent do not lead to effi

External link: http://arxiv.org/abs/2404.11790

Report

Singularities and growth of higher order discrete equations

Authors: Willox, Ralph, Mase, Takafumi, Ramani, Alfred, Grammaticos, Basil

Published in: Open Communications in Nonlinear Mathematical Physics, Proceedings: OCNMP Conference, Bad Ems (Germany), 23-29 June 2024 (April 16, 2024) ocnmp:13267

We study the link between the degree growth of integrable birational mappings of order higher than two and their singularity structures. The higher order mappings we use in this study are all obtained by coupling mappings that are integrable through

External link: http://arxiv.org/abs/2403.14329

Report

A non-asymptotic theory of Kernel Ridge Regression: deterministic equivalents, test error, and GCV estimator

Authors: Misiakiewicz, Theodor, Saeed, Basil

We consider learning an unknown target function $f_*$ using kernel ridge regression (KRR) given i.i.d. data $(u_i,y_i)$, $i\leq n$, where $u_i \in U$ is a covariate vector and $y_i = f_* (u_i) +\varepsilon_i \in \mathbb{R}$. A recent string of work h

External link: http://arxiv.org/abs/2403.08938

Report

CHAI: Clustered Head Attention for Efficient LLM Inference

Authors: Agarwal, Saurabh, Acun, Bilge, Hosmer, Basil, Elhoushi, Mostafa, Lee, Yejin, Venkataraman, Shivaram, Papailiopoulos, Dimitris, Wu, Carole-Jean

Large Language Models (LLMs) with hundreds of billions of parameters have transformed the field of machine learning. However, serving these models at inference time is both compute and memory intensive, where a single request can require multiple GPU

External link: http://arxiv.org/abs/2403.08058
