Showing 1 - 10 of 26 for search: '"Horton, Maxwell"'
Author:
Mehta, Sachin, Horton, Maxwell, Faghri, Fartash, Sekhavat, Mohammad Hossein, Najibi, Mahyar, Farajtabar, Mehrdad, Tuzel, Oncel, Rastegari, Mohammad
Contrastive learning has emerged as a transformative method for learning effective visual representations through the alignment of image and text embeddings. However, pairwise similarity computation in contrastive loss between image and text pairs…
External link:
http://arxiv.org/abs/2404.15653
Author:
Mehta, Sachin, Sekhavat, Mohammad Hossein, Cao, Qingqing, Horton, Maxwell, Jin, Yanzi, Sun, Chenfan, Mirzadeh, Iman, Najibi, Mahyar, Belenko, Dmitry, Zatloukal, Peter, Rastegari, Mohammad
The reproducibility and transparency of large language models are crucial for advancing open research, ensuring the trustworthiness of results, and enabling investigations into data and model biases, as well as potential risks. To this end, we…
External link:
http://arxiv.org/abs/2404.14619
Author:
Salehi, Mohammadreza, Farajtabar, Mehrdad, Horton, Maxwell, Faghri, Fartash, Pouransari, Hadi, Vemulapalli, Raviteja, Tuzel, Oncel, Farhadi, Ali, Rastegari, Mohammad, Mehta, Sachin
Contrastive language image pretraining (CLIP) is a standard method for training vision-language models. While CLIP is scalable, promptable, and robust to distribution shifts on image classification tasks, it lacks object localization capabilities…
External link:
http://arxiv.org/abs/2310.14108
Over the past several years, the synchronization between audio and visual signals has been leveraged to learn richer audio-visual representations. Aided by the large availability of unlabeled videos, many unsupervised training frameworks have…
External link:
http://arxiv.org/abs/2310.03937
Author:
Nunez, Elvis, Merth, Thomas, Prabhu, Anish, Farajtabar, Mehrdad, Rastegari, Mohammad, Mehta, Sachin, Horton, Maxwell
Multi-scale resolution training has seen an increased adoption across multiple vision tasks, including classification and detection. Training with smaller resolutions enables faster training at the expense of a drop in accuracy. Conversely, training…
External link:
http://arxiv.org/abs/2309.04502
Published in:
Transactions on Machine Learning Research 2835-8856 (2024)
Modern deep learning approaches usually utilize modality-specific processing. For example, the most common deep learning approach to image classification involves decoding image file bytes into an RGB tensor which is passed into a neural network…
External link:
http://arxiv.org/abs/2306.00238
Author:
Mehta, Sachin, Naderiparizi, Saeid, Faghri, Fartash, Horton, Maxwell, Chen, Lailin, Farhadi, Ali, Tuzel, Oncel, Rastegari, Mohammad
State-of-the-art automatic augmentation methods (e.g., AutoAugment and RandAugment) for visual recognition tasks diversify training data using a large set of augmentation operations. The range of magnitudes of many augmentation operations…
External link:
http://arxiv.org/abs/2212.10553
Author:
Lin, Chien-Yu, Prabhu, Anish, Merth, Thomas, Mehta, Sachin, Ranjan, Anurag, Horton, Maxwell, Rastegari, Mohammad
Recent isotropic networks, such as ConvMixer and vision transformers, have found significant success across visual recognition tasks, matching or outperforming non-isotropic convolutional neural networks (CNNs). Isotropic architectures are…
External link:
http://arxiv.org/abs/2207.10237
Author:
Nunez, Elvis, Horton, Maxwell, Prabhu, Anish, Ranjan, Anurag, Farhadi, Ali, Rastegari, Mohammad
When deploying deep learning models to a device, it is traditionally assumed that available computational resources (compute, memory, and power) remain static. However, real-world computing systems do not always provide stable resource guarantees…
External link:
http://arxiv.org/abs/2110.04252
Recent observations have advanced our understanding of the neural network optimization landscape, revealing the existence of (1) paths of high accuracy containing diverse solutions and (2) wider minima offering improved performance. Previous methods…
External link:
http://arxiv.org/abs/2102.10472