Showing 1 - 10 of 42 for search: '"Shwartz-Ziv, Ravid"'
Author:
White, Colin, Dooley, Samuel, Roberts, Manley, Pal, Arka, Feuer, Ben, Jain, Siddhartha, Shwartz-Ziv, Ravid, Jain, Neel, Saifullah, Khalid, Naidu, Siddartha, Hegde, Chinmay, LeCun, Yann, Goldstein, Tom, Neiswanger, Willie, Goldblum, Micah
Test set contamination, wherein test data from a benchmark ends up in a newer model's training set, is a well-documented obstacle for fair LLM evaluation and can quickly render benchmarks obsolete. To mitigate this, many recent benchmarks crowdsource…
External link:
http://arxiv.org/abs/2406.19314
Author:
Roush, Allen, Shabazz, Yusuf, Balaji, Arvind, Zhang, Peter, Mezza, Stefano, Zhang, Markus, Basu, Sanjay, Vishwanath, Sriram, Fatemi, Mehdi, Shwartz-Ziv, Ravid
We introduce OpenDebateEvidence, a comprehensive dataset for argument mining and summarization sourced from the American Competitive Debate community. This dataset includes over 3.5 million documents with rich metadata, making it one of the most extensive…
External link:
http://arxiv.org/abs/2406.14657
Author:
Shwartz-Ziv, Ravid, Goldblum, Micah, Bansal, Arpit, Bruss, C. Bayan, LeCun, Yann, Wilson, Andrew Gordon
It is widely believed that a neural network can fit a training set containing at least as many samples as it has parameters, underpinning notions of overparameterized and underparameterized models. In practice, however, we only find solutions accessible…
External link:
http://arxiv.org/abs/2406.11463
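As a rough illustration of the question behind this abstract (my own sketch, not the paper's protocol): one can probe whether a network with roughly as many parameters as training samples actually memorizes random labels.

    import torch
    import torch.nn as nn

    # Hypothetical probe, not the paper's experiment: train a ~210-parameter
    # MLP on 200 randomly labeled points and check whether it fits them all.
    torch.manual_seed(0)
    X = torch.randn(200, 10)
    y = torch.randint(0, 2, (200,))                    # random binary labels
    model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 2))
    print(sum(p.numel() for p in model.parameters()))  # 210 parameters

    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for _ in range(2000):
        opt.zero_grad()
        nn.functional.cross_entropy(model(X), y).backward()
        opt.step()

    acc = (model(X).argmax(dim=-1) == y).float().mean()
    print(f"train accuracy: {acc:.2f}")                # <1.0 illustrates the gap

A train accuracy below 1.0 in such a probe would illustrate the gap the abstract hints at between a model's nominal capacity and the solutions the training procedure actually reaches.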
Author:
Schaeffer, Rylan, Lecomte, Victor, Pai, Dhruv Bhandarkar, Carranza, Andres, Isik, Berivan, Unell, Alyssa, Khona, Mikail, Yerxa, Thomas, LeCun, Yann, Chung, SueYeon, Gromov, Andrey, Shwartz-Ziv, Ravid, Koyejo, Sanmi
Maximum Manifold Capacity Representations (MMCR) is a recent multi-view self-supervised learning (MVSSL) method that matches or surpasses other leading MVSSL methods. MMCR is intriguing because it does not fit neatly into any of the commonplace MVSSL…
External link:
http://arxiv.org/abs/2406.09366
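For orientation (an assumption on my part, not taken from this snippet): the MMCR objective is usually described as maximizing the nuclear norm of the matrix of per-sample centroid embeddings. A minimal sketch, with shapes and normalization assumed for illustration:

    import torch
    import torch.nn.functional as F

    # Rough sketch of the commonly described MMCR objective; not an
    # implementation from this paper.
    def mmcr_loss(views: torch.Tensor) -> torch.Tensor:
        # views: (batch, n_views, dim) embeddings of augmented views
        views = F.normalize(views, dim=-1)   # each view on the unit sphere
        centroids = views.mean(dim=1)        # (batch, dim) per-sample centroids
        # Maximizing the nuclear norm spreads centroids across dimensions.
        return -torch.linalg.matrix_norm(centroids, ord="nuc")

    loss = mmcr_loss(torch.randn(32, 4, 128))  # dummy embeddings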
Entropy minimization (EM) is frequently used to increase the accuracy of classification models when they are faced with new data at test time. EM is a self-supervised learning method that optimizes classifiers to assign even higher probabilities to…
External link:
http://arxiv.org/abs/2405.05012
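A minimal sketch of generic entropy minimization at test time (my illustration of the technique the snippet names, not this paper's exact method; `model` and `optimizer` are placeholders):

    import torch
    import torch.nn.functional as F

    # One EM adaptation step on an unlabeled test batch; `model` and
    # `optimizer` are assumed to exist.
    def em_step(model, optimizer, x_test):
        logits = model(x_test)
        probs = F.softmax(logits, dim=-1)
        # Mean Shannon entropy of the predicted class distributions.
        entropy = -(probs * F.log_softmax(logits, dim=-1)).sum(dim=-1).mean()
        optimizer.zero_grad()
        entropy.backward()   # lowering entropy makes predictions more confident
        optimizer.step()
        return entropy.item()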
Real-world datasets are often highly class-imbalanced, which can adversely impact the performance of deep learning models. The majority of research on training neural networks under class imbalance has focused on specialized loss functions, sampling…
External link:
http://arxiv.org/abs/2312.02517
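As background (an illustration, not this paper's method): the "specialized loss functions" referenced here often reduce to reweighting the loss inversely to class frequency.

    import torch
    import torch.nn as nn

    # Illustrative class-imbalance baseline: inverse-frequency weighted
    # cross-entropy on a 95/5 imbalanced toy label set.
    labels = torch.tensor([0] * 95 + [1] * 5)
    counts = torch.bincount(labels).float()
    weights = counts.sum() / (len(counts) * counts)  # rarer class weighs more
    loss_fn = nn.CrossEntropyLoss(weight=weights)

    logits = torch.randn(len(labels), 2)             # dummy model outputs
    loss = loss_fn(logits, labels)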
Most interpretability research in NLP focuses on understanding the behavior and features of a fully trained model. However, certain insights into model behavior may only be accessible by observing the trajectory of the training process. We present a…
External link:
http://arxiv.org/abs/2309.07311
Transfer learning plays a key role in advancing machine learning models, yet conventional supervised pretraining often undermines feature transferability by prioritizing features that minimize the pretraining loss. In this work, we adapt a self-supervised…
External link:
http://arxiv.org/abs/2306.13292
Self-supervised learning (SSL) is a powerful tool in machine learning, but understanding the learned representations and their underlying mechanisms remains a challenge. This paper presents an in-depth empirical analysis of SSL-trained representations…
External link:
http://arxiv.org/abs/2305.15614
Author:
Shwartz-Ziv, Ravid, LeCun, Yann
Deep neural networks excel in supervised learning tasks but are constrained by the need for extensive labeled data. Self-supervised learning emerges as a promising alternative, allowing models to learn without explicit labels. Information theory, and…
External link:
http://arxiv.org/abs/2304.09355