Showing 1 - 10 of 32 for the search: '"Sehanobish, Arijit"'
Author:
Sehanobish, Arijit, Dubey, Avinava, Choromanski, Krzysztof, Chowdhury, Somnath Basu Roy, Jain, Deepali, Sindhwani, Vikas, Chaturvedi, Snigdha
Recent efforts to scale Transformer models have demonstrated rapid progress across a wide range of tasks (Wei et al., 2022). However, fine-tuning these models for downstream tasks is expensive due to their large parameter counts. Parameter-efficient …
External link:
http://arxiv.org/abs/2406.17740
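The snippet above does not describe the paper's specific parameter-efficient technique; as a generic, hypothetical illustration of what "parameter-efficient fine-tuning" means, the sketch below freezes a pretrained weight matrix and trains only a small low-rank (LoRA-style) correction. All names and dimensions are illustrative assumptions, not the paper's method.

```python
import numpy as np

# Hypothetical parameter-efficient fine-tuning sketch: instead of updating
# a full d_out x d_in weight matrix W, train a low-rank correction B @ A,
# so only r * (d_in + d_out) parameters are learned.
d_in, d_out, r = 768, 768, 8
rng = np.random.default_rng(0)

W = rng.normal(size=(d_out, d_in))       # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01    # trainable low-rank factor
B = np.zeros((d_out, r))                 # trainable, zero-initialized

def adapted_forward(x):
    # Effective weight is W + B @ A, never materialized explicitly.
    return W @ x + B @ (A @ x)

x = rng.normal(size=d_in)
print(adapted_forward(x).shape)                       # (768,)
print("trainable:", r * (d_in + d_out), "vs full:", d_out * d_in)
```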
Author:
Chowdhury, Somnath Basu Roy, Choromanski, Krzysztof, Sehanobish, Arijit, Dubey, Avinava, Chaturvedi, Snigdha
Machine unlearning is the process of efficiently removing the influence of a training data instance from a trained machine learning model without retraining it from scratch. A popular subclass of unlearning approaches is exact machine unlearning, which …
External link:
http://arxiv.org/abs/2406.16257
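As a point of reference for "exact machine unlearning" (the paper's own algorithm is not reproduced here), the sketch below shows the generic shard-and-retrain pattern: the data is split into shards with one sub-model each, and forgetting a point means retraining only its shard. The toy class-mean "model" and the shard count are assumptions for illustration.

```python
import numpy as np

# Generic exact-unlearning sketch via data sharding (SISA-style):
# to forget a point, retrain only the shard that contained it.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 16))
y = rng.integers(0, 2, size=1000)
NUM_SHARDS = 5
shards = np.array_split(np.arange(1000), NUM_SHARDS)

def fit(idx):
    # Toy model: per-class feature means computed on one shard.
    return {c: X[idx][y[idx] == c].mean(axis=0) for c in (0, 1)}

models = [fit(idx) for idx in shards]

def forget(point_id):
    # Exact unlearning: drop the point and retrain only its shard.
    for s, idx in enumerate(shards):
        if point_id in idx:
            shards[s] = idx[idx != point_id]
            models[s] = fit(shards[s])
            return s

print("retrained shard:", forget(42))
```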
Author:
Choromanski, Krzysztof, Sehanobish, Arijit, Chowdhury, Somnath Basu Roy, Lin, Han, Dubey, Avinava, Sarlos, Tamas, Chaturvedi, Snigdha
We present a new class of fast polylog-linear algorithms based on the theory of structured matrices (in particular low displacement rank) for integrating tensor fields defined on weighted trees. Several applications of the resulting fast tree-field integrators …
External link:
http://arxiv.org/abs/2406.15881
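The quantity being accelerated can be stated concretely: for each tree node i, accumulate sum_j f(dist(i, j)) * v_j over weighted path distances. The brute-force O(N^2) baseline below, with an illustrative exponential kernel f, is only meant to pin down that computation; the paper's polylog-linear integrators avoid forming the dense distance matrix.

```python
import numpy as np
from collections import defaultdict

# Brute-force tree-field integration: output_i = sum_j f(dist(i, j)) * v_j,
# where dist is the weighted path distance in the tree. Kernel f and the
# toy tree are illustrative choices.
def tree_distances(n, edges):
    adj = defaultdict(list)
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    dist = np.zeros((n, n))
    for src in range(n):
        stack = [(src, -1, 0.0)]            # DFS from each source node
        while stack:
            node, parent, d = stack.pop()
            dist[src, node] = d
            for nb, w in adj[node]:
                if nb != parent:
                    stack.append((nb, node, d + w))
    return dist

def integrate_field(dist, values, f=lambda d: np.exp(-d)):
    return f(dist) @ values                 # dense N x N matrix-vector product

edges = [(0, 1, 1.0), (1, 2, 0.5), (1, 3, 2.0)]   # toy weighted tree
values = np.eye(4)                                 # one field vector per node
print(integrate_field(tree_distances(4, edges), values))
```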
Author:
Sehanobish, Arijit, Choromanski, Krzysztof, Zhao, Yunfan, Dubey, Avinava, Likhosherstov, Valerii
We introduce the concept of scalable neural network kernels (SNNKs), replacements for regular feedforward layers (FFLs) capable of approximating the latter, but with favorable computational properties. SNNKs effectively disentangle the inputs from …
External link:
http://arxiv.org/abs/2310.13225
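A minimal sketch of the disentangling idea: if a layer's i-th output is a Gaussian kernel K(x, w_i), random Fourier features let it be written as psi(W) @ phi(x), with phi depending only on the input and psi only on the parameters. The feature map below is a standard stand-in under that assumption, not necessarily the SNNK construction from the paper.

```python
import numpy as np

# A layer computing Gaussian-kernel responses K(x, w_i) rewritten as
# psi(W) @ phi(x): input-only features times parameter-only features.
rng = np.random.default_rng(0)
d, n_out, m = 32, 64, 512            # input dim, layer width, feature count
Omega = rng.normal(size=(m, d))      # shared random projections
b = rng.uniform(0, 2 * np.pi, size=m)

def feat(z):
    # Random Fourier feature map for the kernel exp(-||x - y||^2 / 2).
    return np.sqrt(2.0 / m) * np.cos(Omega @ z + b)

W = rng.normal(size=(n_out, d)) * 0.3
x = rng.normal(size=d) * 0.3

psi_W = np.stack([feat(w) for w in W])        # depends only on parameters
phi_x = feat(x)                               # depends only on the input
approx = psi_W @ phi_x                        # "linearized" layer output
exact = np.exp(-0.5 * np.linalg.norm(W - x, axis=1) ** 2)
print(np.max(np.abs(approx - exact)))         # deviation from exact kernel
```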
Author:
Choromanski, Krzysztof, Sehanobish, Arijit, Lin, Han, Zhao, Yunfan, Berger, Eli, Parshakova, Tetiana, Pan, Alvin, Watkins, David, Zhang, Tianyi, Likhosherstov, Valerii, Chowdhury, Somnath Basu Roy, Dubey, Avinava, Jain, Deepali, Sarlos, Tamas, Chaturvedi, Snigdha, Weller, Adrian
Published in:
ICML 2023
We present two new classes of algorithms for efficient field integration on graphs encoding point clouds. The first class, SeparatorFactorization (SF), leverages the bounded genus of point cloud mesh graphs, while the second class, RFDiffusion (RFD), …
External link:
http://arxiv.org/abs/2302.00942
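As with the tree case above, the underlying computation is a dense matrix-vector product (f(dist(i, j)))_{ij} @ V over graph shortest-path distances. The exhaustive Dijkstra baseline below, with an assumed kernel f and a toy graph, is the reference computation that algorithms like SeparatorFactorization and RFDiffusion are designed to avoid; it is not their implementation.

```python
import heapq
import numpy as np

# Reference field integration on a weighted graph: all-pairs shortest
# paths via Dijkstra, then multiply a kernelized distance matrix by the
# vertex field V. Kernel f and the toy graph are illustrative.
def dijkstra(adj, src, n):
    dist = np.full(n, np.inf)
    dist[src] = 0.0
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    return dist

n = 5
adj = {0: [(1, 1.0)], 1: [(0, 1.0), (2, 0.5)], 2: [(1, 0.5), (3, 1.5)],
       3: [(2, 1.5), (4, 0.2)], 4: [(3, 0.2)]}
D = np.stack([dijkstra(adj, s, n) for s in range(n)])   # all-pairs distances
V = np.random.default_rng(0).normal(size=(n, 3))        # field on vertices
print(np.exp(-D) @ V)                                   # integrated field
```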
Large pretrained Transformer-based language models like BERT and GPT have changed the landscape of Natural Language Processing (NLP). However, fine-tuning such models still requires a large number of training examples for each target task, thus annotating …
External link:
http://arxiv.org/abs/2210.13979
Author:
Sehanobish, Arijit, Sandora, McCullen, Abraham, Nabila, Pawar, Jayashri, Torres, Danielle, Das, Anasuya, Becker, Murray, Herzog, Richard, Odry, Benjamin, Vianu, Ron
Pretrained Transformer-based models fine-tuned on domain-specific corpora have changed the landscape of NLP. However, training or fine-tuning these models for individual tasks can be time-consuming and resource-intensive. Thus, a lot of current research …
External link:
http://arxiv.org/abs/2205.02979
Author:
Sehanobish, Arijit, Brown, Nathaniel, Daga, Ishita, Pawar, Jayashri, Torres, Danielle, Das, Anasuya, Becker, Murray, Herzog, Richard, Odry, Benjamin, Vianu, Ron
Pretrained Transformer-based models fine-tuned on domain-specific corpora have changed the landscape of NLP. Generally, if one has multiple tasks on a given dataset, one may fine-tune different models or use task-specific adapters. In this work, we show …
External link:
http://arxiv.org/abs/2204.04544
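For context on the adapter alternative mentioned in the snippet above, the sketch below shows the generic bottleneck-adapter pattern: a frozen shared layer plus a tiny task-specific residual projection per task. Task names, dimensions, and the single-layer "backbone" are hypothetical and not taken from the paper.

```python
import numpy as np

# Generic bottleneck adapters: one frozen backbone layer shared across
# tasks, plus a small down/up projection per task with a residual add.
rng = np.random.default_rng(0)
d, bottleneck = 768, 32
W_frozen = rng.normal(size=(d, d)) / np.sqrt(d)     # shared, never updated

def make_adapter():
    return {"down": rng.normal(size=(bottleneck, d)) * 0.01,
            "up": np.zeros((d, bottleneck))}        # identity map at init

adapters = {"task_a": make_adapter(), "task_b": make_adapter()}

def forward(x, task):
    h = np.tanh(W_frozen @ x)                       # frozen backbone layer
    a = adapters[task]
    return h + a["up"] @ np.tanh(a["down"] @ h)     # task-specific residual

x = rng.normal(size=d)
print(forward(x, "task_a")[:4])
print(forward(x, "task_b")[:4])
```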
Author:
Choromanski, Krzysztof, Chen, Haoxian, Lin, Han, Ma, Yuanzhe, Sehanobish, Arijit, Jain, Deepali, Ryoo, Michael S, Varley, Jake, Zeng, Andy, Likhosherstov, Valerii, Kalashnikov, Dmitry, Sindhwani, Vikas, Weller, Adrian
We propose a new class of random feature methods for linearizing softmax and Gaussian kernels called hybrid random features (HRFs) that automatically adapt the quality of kernel estimation to provide the most accurate approximation in the defined regions …
External link:
http://arxiv.org/abs/2110.04367
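For orientation, the softmax kernel exp(x^T y) can be linearized with positive random features (the FAVOR+ estimator), sketched below; hybrid random features combine several such estimators to adapt accuracy to regions of interest, a mechanism this toy snippet does not reproduce.

```python
import numpy as np

# Positive random features for the softmax kernel:
# exp(x^T y) = E_w[ phi_w(x) * phi_w(y) ] with w ~ N(0, I) and
# phi_w(z) = exp(w^T z - ||z||^2 / 2).
rng = np.random.default_rng(0)
d, m = 16, 8192
W = rng.normal(size=(m, d))              # Gaussian projection directions

def phi(z):
    return np.exp(W @ z - np.dot(z, z) / 2.0) / np.sqrt(m)

x = rng.normal(size=d) * 0.2
y = rng.normal(size=d) * 0.2
print("exact   :", np.exp(x @ y))
print("estimate:", phi(x) @ phi(y))      # unbiased Monte Carlo estimate
```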
Transformers are state-of-the-art deep learning models that are composed of stacked attention and point-wise, fully connected layers designed for handling sequential data. Transformers are not only ubiquitous throughout Natural Language Processing (NLP) …
External link:
http://arxiv.org/abs/2109.13925
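A minimal single-head block matching that description, with softmax attention followed by a point-wise feed-forward layer and residual connections; layer norm and multi-head splitting are omitted for brevity and all weights are random placeholders.

```python
import numpy as np

# Minimal Transformer block: attention sub-layer + per-token feed-forward
# sub-layer, each followed by a residual connection.
rng = np.random.default_rng(0)
seq_len, d = 6, 32
Wq, Wk, Wv = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))
W1 = rng.normal(size=(4 * d, d)) / np.sqrt(d)
W2 = rng.normal(size=(d, 4 * d)) / np.sqrt(4 * d)

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def block(X):                                   # X: (seq_len, d)
    Q, K, V = X @ Wq.T, X @ Wk.T, X @ Wv.T
    attn = softmax(Q @ K.T / np.sqrt(d)) @ V    # attention sub-layer
    X = X + attn
    ffn = np.maximum(X @ W1.T, 0.0) @ W2.T      # point-wise feed-forward
    return X + ffn

X = rng.normal(size=(seq_len, d))
print(block(X).shape)                           # (6, 32)
```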