Showing 1 - 10 of 7,308 for search: '"Ravid A"'
Author:
Patel, Niket, Shwartz-Ziv, Ravid
Deep neural networks tend to exhibit a bias toward low-rank solutions during training, implicitly learning low-dimensional feature representations. This paper investigates how deep multilayer perceptrons (MLPs) encode these feature manifolds and conn…
External link:
http://arxiv.org/abs/2410.07687
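The low-rank bias mentioned in the snippet above can be probed by counting the significant singular values of a weight matrix. A minimal sketch (the function name and the tolerance threshold are illustrative choices, not values from the paper):

```python
import numpy as np

def effective_rank(W, tol=1e-3):
    """Count singular values above tol * s_max.

    A crude proxy for the low-dimensional structure that trained MLP
    weight matrices are said to develop; `tol` is an arbitrary
    illustrative cutoff, not a value from the paper.
    """
    s = np.linalg.svd(W, compute_uv=False)  # singular values, descending
    return int((s > tol * s[0]).sum())
```

For example, a sum of two outer products has effective rank 2, while the identity matrix is full rank.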
Author:
Zheng, Xinyuan, Ravid, Orren, Barry, Robert A. J., Kim, Yoojean, Wang, Qian, Kim, Young-geun, Zhu, Xi, He, Xiaofu
Autism spectrum disorders (ASDs) are developmental conditions characterized by restricted interests and difficulties in communication. The complexity of ASD has resulted in a deficiency of objective diagnostic biomarkers. Deep learning methods have g…
External link:
http://arxiv.org/abs/2410.00068
Large Language Models (LLMs) generate text by sampling the next token from a probability distribution over the vocabulary at each decoding step. However, popular sampling methods like top-p (nucleus sampling) often struggle to balance quality and div…
External link:
http://arxiv.org/abs/2407.01082
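The top-p (nucleus) sampling this snippet refers to keeps the smallest set of tokens whose cumulative probability reaches p, renormalizes, and samples from that set. A minimal sketch (function name and default p are illustrative, not from the paper):

```python
import numpy as np

def top_p_sample(probs, p=0.9, rng=None):
    """Sample a token index using top-p (nucleus) sampling."""
    rng = rng or np.random.default_rng()
    order = np.argsort(probs)[::-1]        # tokens by descending probability
    cum = np.cumsum(probs[order])
    cutoff = np.searchsorted(cum, p) + 1   # smallest nucleus covering mass p
    nucleus = order[:cutoff]
    nucleus_probs = probs[nucleus] / probs[nucleus].sum()
    return int(rng.choice(nucleus, p=nucleus_probs))
```

With p near 0 this degenerates to greedy decoding (only the most probable token survives); with p = 1 it is ordinary sampling from the full distribution.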
Author:
White, Colin, Dooley, Samuel, Roberts, Manley, Pal, Arka, Feuer, Ben, Jain, Siddhartha, Shwartz-Ziv, Ravid, Jain, Neel, Saifullah, Khalid, Naidu, Siddartha, Hegde, Chinmay, LeCun, Yann, Goldstein, Tom, Neiswanger, Willie, Goldblum, Micah
Test set contamination, wherein test data from a benchmark ends up in a newer model's training set, is a well-documented obstacle for fair LLM evaluation and can quickly render benchmarks obsolete. To mitigate this, many recent benchmarks crowdsource…
External link:
http://arxiv.org/abs/2406.19314
Author:
Roush, Allen, Shabazz, Yusuf, Balaji, Arvind, Zhang, Peter, Mezza, Stefano, Zhang, Markus, Basu, Sanjay, Vishwanath, Sriram, Fatemi, Mehdi, Shwartz-Ziv, Ravid
We introduce OpenDebateEvidence, a comprehensive dataset for argument mining and summarization sourced from the American Competitive Debate community. This dataset includes over 3.5 million documents with rich metadata, making it one of the most exte…
External link:
http://arxiv.org/abs/2406.14657
Author:
Shwartz-Ziv, Ravid, Goldblum, Micah, Bansal, Arpit, Bruss, C. Bayan, LeCun, Yann, Wilson, Andrew Gordon
It is widely believed that a neural network can fit a training set containing at least as many samples as it has parameters, underpinning notions of overparameterized and underparameterized models. In practice, however, we only find solutions accessi…
External link:
http://arxiv.org/abs/2406.11463
Author:
Schaeffer, Rylan, Lecomte, Victor, Pai, Dhruv Bhandarkar, Carranza, Andres, Isik, Berivan, Unell, Alyssa, Khona, Mikail, Yerxa, Thomas, LeCun, Yann, Chung, SueYeon, Gromov, Andrey, Shwartz-Ziv, Ravid, Koyejo, Sanmi
Maximum Manifold Capacity Representations (MMCR) is a recent multi-view self-supervised learning (MVSSL) method that matches or surpasses other leading MVSSL methods. MMCR is intriguing because it does not fit neatly into any of the commonplace MVSSL…
External link:
http://arxiv.org/abs/2406.09366
Entropy minimization (EM) is frequently used to increase the accuracy of classification models when they are faced with new data at test time. EM is a self-supervised learning method that optimizes classifiers to assign even higher probabilities to th…
External link:
http://arxiv.org/abs/2405.05012
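Entropy minimization as described in this snippet drives down the Shannon entropy of the model's own softmax predictions on unlabeled test inputs. A minimal sketch of the objective being minimized (the function name is illustrative; this is the standard formulation, not necessarily the paper's exact variant):

```python
import numpy as np

def prediction_entropy(logits):
    """Mean Shannon entropy of softmax predictions over a batch.

    Confident (near one-hot) predictions score near zero; uniform
    predictions score near log(num_classes).
    """
    z = logits - logits.max(axis=-1, keepdims=True)  # numerically stable softmax
    p = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return float(-(p * np.log(p + 1e-12)).sum(axis=-1).mean())
```

Test-time adaptation methods in this family take gradient steps on this quantity, pushing the classifier toward more confident predictions on the new data.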
Large Language Models (LLMs) have achieved remarkable performance across various natural language processing tasks, primarily due to the transformer architecture and its self-attention mechanism. However, we observe that in standard decoder-style LLM…
External link:
http://arxiv.org/abs/2404.08634
In this study, we implement the deviatoric curvature model to examine dynamically triangulated surfaces with anisotropic membrane inclusions. The Monte Carlo numerical scheme is devised to not only minimize the total bending energy of the membrane bu…
External link:
http://arxiv.org/abs/2403.02885
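The truncated snippet mentions a Monte Carlo scheme for minimizing a membrane bending energy. Such schemes typically rest on the standard Metropolis acceptance rule, sketched below (this is the generic rule, not the paper's actual implementation; kT=1.0 is an illustrative default):

```python
import numpy as np

def metropolis_accept(delta_e, kT=1.0, rng=None):
    """Standard Metropolis acceptance rule.

    Always accept moves that lower the energy; accept uphill moves
    with probability exp(-delta_e / kT), which lets the scheme escape
    local minima while still driving the energy down on average.
    """
    rng = rng or np.random.default_rng()
    return delta_e <= 0.0 or rng.random() < np.exp(-delta_e / kT)
```

In a triangulated-surface simulation, delta_e would be the change in total bending energy from a proposed vertex move or edge flip.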