Showing 1 - 10 of 296 for search: '"Hardt, Moritz"'
Author:
Dominguez-Olmedo, Ricardo, Nanda, Vedant, Abebe, Rediet, Bechtold, Stefan, Engel, Christoph, Frankenreiter, Jens, Gummadi, Krishna, Hardt, Moritz, Livermore, Michael
Annotation and classification of legal text are central components of empirical legal research. Traditionally, these tasks are often delegated to trained research assistants. Motivated by the advances in language modeling, empirical legal scholars ar
External link:
http://arxiv.org/abs/2407.16615
Current question-answering benchmarks predominantly focus on accuracy in realizable prediction tasks. Conditioned on a question and answer-key, does the most likely token match the ground truth? Such benchmarks necessarily fail to evaluate language m
External link:
http://arxiv.org/abs/2407.14614
We study a fundamental problem in the evaluation of large language models that we call training on the test task. Unlike wrongful practices like training on the test data, leakage, or data contamination, training on the test task is not a malpractice
External link:
http://arxiv.org/abs/2407.07890
We study the predictability of online speech on social media, and whether predictability improves with information outside a user's own posts. Recent work suggests that the predictive information contained in posts written by a user's peers can surpa
External link:
http://arxiv.org/abs/2407.12850
Algorithmic predictions are emerging as a promising solution concept for efficiently allocating societal resources. Fueling their use is an underlying assumption that such systems are necessary to identify individuals for interventions. We propose a
External link:
http://arxiv.org/abs/2406.13882
Many applications of RCTs involve the presence of multiple treatment administrators -- from field experiments to online advertising -- that compete for the subjects' attention. In the face of competition, estimating a causal effect becomes difficult,
External link:
http://arxiv.org/abs/2406.03422
The power of digital platforms is at the center of major ongoing policy and regulatory efforts. To advance existing debates, we designed and executed an experiment to measure the power of online search providers, building on the recent definition of
External link:
http://arxiv.org/abs/2405.19073
Author:
Zhang, Guanhua, Hardt, Moritz
We examine multi-task benchmarks in machine learning through the lens of social choice theory. We draw an analogy between benchmarks and electoral systems, where models are candidates and tasks are voters. This suggests a distinction between cardinal
External link:
http://arxiv.org/abs/2405.01719
Author:
Salaudeen, Olawale, Hardt, Moritz
We introduce ImageNot, a dataset designed to match the scale of ImageNet while differing drastically in other aspects. We show that key model architectures developed for ImageNet over the years rank identically when trained and evaluated on ImageNot
External link:
http://arxiv.org/abs/2404.02112
Author:
Nastl, Vivian Y., Hardt, Moritz
We study how well machine learning models trained on causal features generalize across domains. We consider 16 prediction tasks on tabular datasets covering applications in health, employment, education, social benefits, and politics. Each dataset co
External link:
http://arxiv.org/abs/2402.09891