Showing 1 - 10 of 4,335
for search: '"A. Kesselheim"'
Data pruning is the problem of identifying a core subset that is most beneficial to training and discarding the remainder. While pruning strategies are well studied for discriminative models like those used in classification, little research has gone …
External link:
http://arxiv.org/abs/2411.12523
Author:
Flöge, Klemens, Udayakumar, Srisruthi, Sommer, Johanna, Piraud, Marie, Kesselheim, Stefan, Fortuin, Vincent, Günnemann, Stephan, van der Weg, Karel J, Gohlke, Holger, Bazarova, Alina, Merdivan, Erinc
Recent AI advances have enabled multi-modal systems to model and translate diverse information spaces. Extending beyond text and vision, we introduce OneProt, a multi-modal AI for proteins that integrates structural, sequence, alignment, and binding …
External link:
http://arxiv.org/abs/2411.04863
One of the main challenges in optimal scaling of large language models (LLMs) is the prohibitive cost of hyperparameter tuning, particularly learning rate $\eta$ and batch size $B$. While techniques like $\mu$P (Yang et al., 2022) provide scaling rules …
External link:
http://arxiv.org/abs/2410.05838
Author:
Ali, Mehdi, Fromm, Michael, Thellmann, Klaudia, Ebert, Jan, Weber, Alexander Arno, Rutmann, Richard, Jain, Charvi, Lübbering, Max, Steinigen, Daniel, Leveling, Johannes, Klug, Katrin, Buschhoff, Jasper Schulze, Jurkschat, Lena, Abdelwahab, Hammam, Stein, Benny Jörg, Sylla, Karl-Heinz, Denisov, Pavel, Brandizzi, Nicolo', Saleem, Qasid, Bhowmick, Anirban, Helmer, Lennard, John, Chelsea, Suarez, Pedro Ortiz, Ostendorff, Malte, Jude, Alex, Manjunath, Lalith, Weinbach, Samuel, Penke, Carolin, Filatov, Oleg, Asaadi, Shima, Barth, Fabio, Sifa, Rafet, Küch, Fabian, Herten, Andreas, Jäkel, René, Rehm, Georg, Kesselheim, Stefan, Köhler, Joachim, Flores-Herr, Nicolas
We present two multilingual LLMs designed to embrace Europe's linguistic diversity by supporting all 24 official languages of the European Union. Trained on a dataset comprising around 60% non-English data and utilizing a custom multilingual tokenizer …
External link:
http://arxiv.org/abs/2410.03730
In online combinatorial allocations/auctions, n bidders sequentially arrive, each with a combinatorial valuation (such as submodular/XOS) over subsets of m indivisible items. The aim is to immediately allocate a subset of the remaining items to maximize …
External link:
http://arxiv.org/abs/2409.11091
Many classical problems in theoretical computer science involve norms, even if implicitly; for example, both XOS functions and downward-closed sets are equivalent to some norms. The last decade has seen a lot of interest in designing algorithms beyond …
External link:
http://arxiv.org/abs/2406.15180
We study online capacitated resource allocation, a natural generalization of online stochastic max-weight bipartite matching. This problem is motivated by ride-sharing and Internet advertising applications, where online arrivals may have the capacity …
External link:
http://arxiv.org/abs/2406.07757
Selling a single item to $n$ self-interested buyers is a fundamental problem in economics, where the two objectives typically considered are welfare maximization and revenue maximization. Since the optimal mechanisms are often impractical and do not …
External link:
http://arxiv.org/abs/2406.00819
Combinatorial contracts are emerging as a key paradigm in algorithmic contract design, paralleling the role of combinatorial auctions in algorithmic mechanism design. In this paper we study natural combinatorial contract settings involving teams of agents …
External link:
http://arxiv.org/abs/2405.08260
Author:
Ali, Mehdi, Fromm, Michael, Thellmann, Klaudia, Rutmann, Richard, Lübbering, Max, Leveling, Johannes, Klug, Katrin, Ebert, Jan, Doll, Niclas, Buschhoff, Jasper Schulze, Jain, Charvi, Weber, Alexander Arno, Jurkschat, Lena, Abdelwahab, Hammam, John, Chelsea, Suarez, Pedro Ortiz, Ostendorff, Malte, Weinbach, Samuel, Sifa, Rafet, Kesselheim, Stefan, Flores-Herr, Nicolas
The recent success of Large Language Models (LLMs) has been predominantly driven by curating the training dataset composition, scaling of model architectures and dataset sizes and advancements in pretraining objectives, leaving tokenizer influence as …
External link:
http://arxiv.org/abs/2310.08754