Showing 1 - 10 of 7,308 for search: '"Ravid A"'
Author:
Patel, Niket, Shwartz-Ziv, Ravid
Deep neural networks tend to exhibit a bias toward low-rank solutions during training, implicitly learning low-dimensional feature representations. This paper investigates how deep multilayer perceptrons (MLPs) encode these feature manifolds and conn…
External link:
http://arxiv.org/abs/2410.07687
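The low-rank bias mentioned in the snippet above can be probed by counting the significant singular values of a weight matrix. A minimal sketch (the function name and the tolerance threshold are illustrative choices, not values from the paper):

```python
import numpy as np

def effective_rank(W, tol=1e-3):
    """Count singular values above tol * s_max.

    A crude proxy for the low-dimensional structure that trained MLP
    weight matrices are said to develop; `tol` is an arbitrary
    illustrative cutoff, not a value from the paper.
    """
    s = np.linalg.svd(W, compute_uv=False)  # singular values, descending
    return int((s > tol * s[0]).sum())
```

For example, a sum of two outer products has effective rank 2, while the identity matrix is full rank.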
Author:
Zheng, Xinyuan, Ravid, Orren, Barry, Robert A. J., Kim, Yoojean, Wang, Qian, Kim, Young-geun, Zhu, Xi, He, Xiaofu
Autism spectrum disorders (ASDs) are developmental conditions characterized by restricted interests and difficulties in communication. The complexity of ASD has resulted in a deficiency of objective diagnostic biomarkers. Deep learning methods have g…
External link:
http://arxiv.org/abs/2410.00068
Large Language Models (LLMs) generate text by sampling the next token from a probability distribution over the vocabulary at each decoding step. However, popular sampling methods like top-p (nucleus sampling) often struggle to balance quality and div…
External link:
http://arxiv.org/abs/2407.01082
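The top-p (nucleus) sampling this snippet refers to keeps the smallest set of tokens whose cumulative probability reaches p, renormalizes, and samples from that set. A minimal sketch (function name and default p are illustrative, not from the paper):

```python
import numpy as np

def top_p_sample(probs, p=0.9, rng=None):
    """Sample a token index using top-p (nucleus) sampling."""
    rng = rng or np.random.default_rng()
    order = np.argsort(probs)[::-1]        # tokens by descending probability
    cum = np.cumsum(probs[order])
    cutoff = np.searchsorted(cum, p) + 1   # smallest nucleus covering mass p
    nucleus = order[:cutoff]
    nucleus_probs = probs[nucleus] / probs[nucleus].sum()
    return int(rng.choice(nucleus, p=nucleus_probs))
```

With p near 0 this degenerates to greedy decoding (only the most probable token survives); with p = 1 it is ordinary sampling from the full distribution.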
Author:
White, Colin, Dooley, Samuel, Roberts, Manley, Pal, Arka, Feuer, Ben, Jain, Siddhartha, Shwartz-Ziv, Ravid, Jain, Neel, Saifullah, Khalid, Naidu, Siddartha, Hegde, Chinmay, LeCun, Yann, Goldstein, Tom, Neiswanger, Willie, Goldblum, Micah
Test set contamination, wherein test data from a benchmark ends up in a newer model's training set, is a well-documented obstacle for fair LLM evaluation and can quickly render benchmarks obsolete. To mitigate this, many recent benchmarks crowdsource…
External link:
http://arxiv.org/abs/2406.19314
Author:
Roush, Allen, Shabazz, Yusuf, Balaji, Arvind, Zhang, Peter, Mezza, Stefano, Zhang, Markus, Basu, Sanjay, Vishwanath, Sriram, Fatemi, Mehdi, Shwartz-Ziv, Ravid
We introduce OpenDebateEvidence, a comprehensive dataset for argument mining and summarization sourced from the American Competitive Debate community. This dataset includes over 3.5 million documents with rich metadata, making it one of the most exte…
External link:
http://arxiv.org/abs/2406.14657
Author:
Shwartz-Ziv, Ravid, Goldblum, Micah, Bansal, Arpit, Bruss, C. Bayan, LeCun, Yann, Wilson, Andrew Gordon
It is widely believed that a neural network can fit a training set containing at least as many samples as it has parameters, underpinning notions of overparameterized and underparameterized models. In practice, however, we only find solutions accessi…
External link:
http://arxiv.org/abs/2406.11463
Author:
Schaeffer, Rylan, Lecomte, Victor, Pai, Dhruv Bhandarkar, Carranza, Andres, Isik, Berivan, Unell, Alyssa, Khona, Mikail, Yerxa, Thomas, LeCun, Yann, Chung, SueYeon, Gromov, Andrey, Shwartz-Ziv, Ravid, Koyejo, Sanmi
Maximum Manifold Capacity Representations (MMCR) is a recent multi-view self-supervised learning (MVSSL) method that matches or surpasses other leading MVSSL methods. MMCR is intriguing because it does not fit neatly into any of the commonplace MVSSL…
External link:
http://arxiv.org/abs/2406.09366
Entropy minimization (EM) is frequently used to increase the accuracy of classification models when they are faced with new data at test time. EM is a self-supervised learning method that optimizes classifiers to assign even higher probabilities to th…
External link:
http://arxiv.org/abs/2405.05012
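Entropy minimization as described in this snippet drives down the Shannon entropy of the model's own softmax predictions on unlabeled test inputs. A minimal sketch of the objective being minimized (the function name is illustrative; this is the standard formulation, not necessarily the paper's exact variant):

```python
import numpy as np

def prediction_entropy(logits):
    """Mean Shannon entropy of softmax predictions over a batch.

    Confident (near one-hot) predictions score near zero; uniform
    predictions score near log(num_classes).
    """
    z = logits - logits.max(axis=-1, keepdims=True)  # numerically stable softmax
    p = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return float(-(p * np.log(p + 1e-12)).sum(axis=-1).mean())
```

Test-time adaptation methods in this family take gradient steps on this quantity, pushing the classifier toward more confident predictions on the new data.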
Large Language Models (LLMs) have achieved remarkable performance across various natural language processing tasks, primarily due to the transformer architecture and its self-attention mechanism. However, we observe that in standard decoder-style LLM…
External link:
http://arxiv.org/abs/2404.08634
In this study, we implement the deviatoric curvature model to examine dynamically triangulated surfaces with anisotropic membrane inclusions. The Monte Carlo numerical scheme is devised to not only minimize the total bending energy of the membrane bu…
External link:
http://arxiv.org/abs/2403.02885
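The truncated snippet mentions a Monte Carlo scheme for minimizing a membrane bending energy. Such schemes typically rest on the standard Metropolis acceptance rule, sketched below (this is the generic rule, not the paper's actual implementation; kT=1.0 is an illustrative default):

```python
import numpy as np

def metropolis_accept(delta_e, kT=1.0, rng=None):
    """Standard Metropolis acceptance rule.

    Always accept moves that lower the energy; accept uphill moves
    with probability exp(-delta_e / kT), which lets the scheme escape
    local minima while still driving the energy down on average.
    """
    rng = rng or np.random.default_rng()
    return delta_e <= 0.0 or rng.random() < np.exp(-delta_e / kT)
```

In a triangulated-surface simulation, delta_e would be the change in total bending energy from a proposed vertex move or edge flip.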