Showing 1 - 10 of 9,824 for the search: '"Karp P"'
Author:
Karp, Dmitrii, Prilepkina, Elena
Investigation of the generalized trigonometric and hyperbolic functions containing two parameters has been a very active research area over the last decade. We believe, however, that their monotonicity and convexity properties with respect to parameters … (a standard definition of these functions is recalled after this entry).
External link:
http://arxiv.org/abs/2411.13442
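For orientation, one standard normalization of the two-parameter generalized sine used in this literature is the following (an assumption about the convention; the paper may normalize differently). For $p, q > 1$,
\[
\arcsin_{p,q}(x) \;=\; \int_0^x \bigl(1 - t^{q}\bigr)^{-1/p}\,\mathrm{d}t, \qquad x \in [0,1],
\]
and $\sin_{p,q}$ is the inverse of $\arcsin_{p,q}$ on $[0, \pi_{p,q}/2]$, where $\pi_{p,q} = \tfrac{2}{q}\,B\!\bigl(1-\tfrac{1}{p},\,\tfrac{1}{q}\bigr)$. The hyperbolic analogue is obtained by replacing $1 - t^{q}$ with $1 + t^{q}$ in the integrand.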
Author:
Saunshi, Nikunj, Karp, Stefani, Krishnan, Shankar, Miryoosefi, Sobhan, Reddi, Sashank J., Kumar, Sanjiv
Given the increasing scale of model sizes, novel training strategies like gradual stacking [Gong et al., 2019, Reddi et al., 2023] have garnered interest. Stacking enables efficient training by gradually growing the depth of a model in stages and using … (a minimal sketch of the stacking idea follows this entry).
External link:
http://arxiv.org/abs/2409.19044
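As an illustration of the stacking idea only, not the training recipe of the paper, here is a minimal PyTorch-style sketch that grows the depth of an encoder stack by copying its trained blocks; all names and hyperparameters are placeholders.

import copy
import torch.nn as nn

def make_block(d_model=256, n_heads=4):
    # One Transformer encoder block; the sizes are illustrative placeholders.
    return nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads, batch_first=True)

def grow_by_stacking(blocks: nn.ModuleList) -> nn.ModuleList:
    # Double the depth by appending deep copies of the current blocks, so the
    # new layers start from already-trained weights instead of random init.
    return nn.ModuleList(list(blocks) + [copy.deepcopy(b) for b in blocks])

stack = nn.ModuleList([make_block() for _ in range(3)])
# ... train the 3-block model for some steps, then grow and keep training ...
stack = grow_by_stacking(stack)  # now 6 blocks, seeded from the trained 3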
An elementary but very useful lemma due to Biernacki and Krzyż (1955) asserts that the ratio of two power series inherits monotonicity from that of the sequence of ratios of their corresponding coefficients. Over the last two decades it has been re… (the lemma in its usual formulation is stated after this entry).
External link:
http://arxiv.org/abs/2408.01755
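For reference, the lemma in its usual formulation (a paraphrase of the standard statement, not a quotation from the paper): if $f(x) = \sum_{n \ge 0} a_n x^{n}$ and $g(x) = \sum_{n \ge 0} b_n x^{n}$ are real power series converging on $(0, R)$ with $b_n > 0$ for all $n$, and the sequence $\{a_n / b_n\}_{n \ge 0}$ is increasing (decreasing), then the ratio $x \mapsto f(x)/g(x)$ is increasing (decreasing) on $(0, R)$.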
Gradient aggregation has long been identified as a major bottleneck in today's large-scale distributed machine learning training systems. One promising solution to mitigate such bottlenecks is gradient compression, directly reducing communicated gradient … (a generic top-k sparsification sketch follows this entry).
External link:
http://arxiv.org/abs/2407.01378
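As a generic illustration of gradient compression, here is a self-contained PyTorch sketch of top-k sparsification, one common scheme; it is not the method proposed in the work above.

import torch

def topk_compress(grad: torch.Tensor, k: int):
    # Keep only the k largest-magnitude entries and communicate (indices, values)
    # instead of the full dense gradient.
    flat = grad.flatten()
    _, indices = torch.topk(flat.abs(), k)
    return indices, flat[indices]

def topk_decompress(indices: torch.Tensor, values: torch.Tensor, shape) -> torch.Tensor:
    # Rebuild a dense gradient that is zero outside the transmitted entries.
    flat = torch.zeros(torch.Size(shape).numel(), dtype=values.dtype)
    flat[indices] = values
    return flat.reshape(shape)

g = torch.randn(4, 4)
idx, vals = topk_compress(g, k=4)            # ~4 values sent instead of 16
g_hat = topk_decompress(idx, vals, g.shape)  # approximate gradient after transfer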
Author:
Benrimoh, David, Armstrong, Caitrin, Mehltretter, Joseph, Fratila, Robert, Perlman, Kelly, Israel, Sonia, Kapelner, Adam, Parikh, Sagar V., Karp, Jordan F., Heller, Katherine, Turecki, Gustavo
INTRODUCTION: The pharmacological treatment of Major Depressive Disorder (MDD) relies on a trial-and-error approach. We introduce an artificial intelligence (AI) model aiming to personalize treatment and improve outcomes, which was deployed in the Ar…
External link:
http://arxiv.org/abs/2406.04993
Recently, there has been increasing interest in efficient pretraining paradigms for training Transformer-based models. Several recent approaches use smaller models to initialize larger models in order to save computation (e.g., stacking and fusion).
External link:
http://arxiv.org/abs/2406.02469
A quasi-exponential is an entire function of the form $e^{cu}p(u)$, where $p(u)$ is a polynomial and $c \in \mathbb{C}$. Let $V = \langle e^{h_1u}p_1(u), \dots, e^{h_Nu}p_N(u) \rangle$ be a vector space with a basis of quasi-exponentials. We show that … (a concrete instance of such a space is given after this entry).
External link:
http://arxiv.org/abs/2405.20229
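A concrete instance of the definition above, purely for illustration: with $N = 2$, $h_1 = 1$, $h_2 = 2$, $p_1(u) = u + 1$ and $p_2(u) = u^{2}$,
\[
V = \bigl\langle\, e^{u}(u+1),\; e^{2u}u^{2} \,\bigr\rangle
\]
is a two-dimensional vector space spanned by quasi-exponentials.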
Author:
Karp, Martin, Suarez, Estela, Meinke, Jan H., Andersson, Måns I., Schlatter, Philipp, Markidis, Stefano, Jansson, Niclas
The never-ending computational demand from simulations of turbulence makes computational fluid dynamics (CFD) a prime application use case for current and future exascale systems. High-order finite element methods, such as the spectral element method, … (the standard nodal SEM expansion is recalled after this entry).
External link:
http://arxiv.org/abs/2405.05640
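For background, the standard nodal form of the spectral element method (generic SEM, not the specific solver discussed above) expands the solution on each element in Lagrange interpolants through the Gauss-Lobatto-Legendre (GLL) points:
\[
u(x)\big|_{\Omega^{e}} \;\approx\; \sum_{i=0}^{N} u_i^{e}\, \ell_i\!\bigl(\xi(x)\bigr),
\qquad \ell_i(\xi_j) = \delta_{ij},
\]
where $\xi(x)$ maps $\Omega^{e}$ to the reference element $[-1,1]$ (tensor products of the $\ell_i$ in higher dimensions), so the unknowns are nodal values and the resulting operators stay element-local.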
As supercomputers' complexity has grown, the traditional boundaries between processor, memory, network, and accelerators have blurred, making a homogeneous computer model, in which the overall computer system is modeled as a continuous medium with ho…
External link:
http://arxiv.org/abs/2405.05639
Vision tasks are characterized by the properties of locality and translation invariance. The superior performance of convolutional neural networks (CNNs) on these tasks is widely attributed to the inductive bias of locality and weight sharing baked into … (a toy illustration of these two properties follows this entry).
External link:
http://arxiv.org/abs/2403.15707
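To make the two properties concrete, a toy NumPy sketch (purely illustrative, unrelated to the model studied above): a 1-D convolution applies one small shared kernel at every position (weight sharing), and each output depends only on a local window of the input (locality), whereas a dense layer would learn a separate full-width weight vector per output.

import numpy as np

def conv1d(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    # Slide the same kernel w over x: shared weights, local receptive field.
    k = len(w)
    return np.array([np.dot(x[i:i + k], w) for i in range(len(x) - k + 1)])

x = np.arange(8, dtype=float)
w = np.array([1.0, 0.0, -1.0])   # one shared 3-tap kernel: 3 parameters in total
print(conv1d(x, w))              # 6 outputs, each seeing only 3 neighbouring inputs
# A fully connected map from 8 inputs to the same 6 outputs would need 6 x 8 = 48
# independent weights and would let every output see the entire input.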