Zobrazeno 1 - 10
of 2 972
pro vyhledávání: '"Jaggi P"'
Learning Rate Warmup is a popular heuristic for training neural networks, especially at larger batch sizes, despite limited understanding of its benefits. Warmup decreases the update size $\Delta \mathbf{w}_t = \eta_t \mathbf{u}_t$ early in training
Externí odkaz:
http://arxiv.org/abs/2410.23922
We study stochastic second-order methods for solving general non-convex optimization problems. We propose using a special version of momentum to stabilize the stochastic gradient and Hessian estimates in Newton's method. We show that momentum provabl
Externí odkaz:
http://arxiv.org/abs/2410.19644
When annotators disagree, predicting the labels given by individual annotators can capture nuances overlooked by traditional label aggregation. We introduce three approaches to predicting individual annotator ratings on the toxicity of text by incorp
Externí odkaz:
http://arxiv.org/abs/2410.12217
Influence functions provide a principled method to assess the contribution of individual training samples to a specific target. Yet, their high computational costs limit their applications on large-scale models and datasets. Existing methods proposed
Externí odkaz:
http://arxiv.org/abs/2410.05090
Autor:
Pandey, Himanshu, Amod, Akhil, Shivang, Jaggi, Kshitij, Garg, Ruchi, Jain, Abheet, Tantia, Vinayak
Artificial Intelligence (AI) and Large Language Models (LLMs) hold significant promise in revolutionizing healthcare, especially in clinical applications. Simultaneously, Digital Twin technology, which models and simulates complex systems, has gained
Externí odkaz:
http://arxiv.org/abs/2409.17650
On-device LLMs have gained increasing attention for their ability to enhance privacy and provide a personalized user experience. To facilitate learning with private and scarce local data, federated learning has become a standard approach, though it i
Externí odkaz:
http://arxiv.org/abs/2409.13931
Collaborative learning is an important tool to train multiple clients more effectively by enabling communication among clients. Identifying helpful clients, however, presents challenging and often introduces significant overhead. In this paper, we mo
Externí odkaz:
http://arxiv.org/abs/2409.05539
Autor:
Chayti, El Mahdi, Jaggi, Martin
Learning new tasks by drawing on prior experience gathered from other (related) tasks is a core property of any intelligent system. Gradient-based meta-learning, especially MAML and its variants, has emerged as a viable solution to accomplish this go
Externí odkaz:
http://arxiv.org/abs/2409.03682
Autor:
Borges, Beatriz, Foroutan, Negar, Bayazit, Deniz, Sotnikova, Anna, Montariol, Syrielle, Nazaretzky, Tanya, Banaei, Mohammadreza, Sakhaeirad, Alireza, Servant, Philippe, Neshaei, Seyed Parsa, Frej, Jibril, Romanou, Angelika, Weiss, Gail, Mamooler, Sepideh, Chen, Zeming, Fan, Simin, Gao, Silin, Ismayilzada, Mete, Paul, Debjit, Schöpfer, Alexandre, Janchevski, Andrej, Tiede, Anja, Linden, Clarence, Troiani, Emanuele, Salvi, Francesco, Behrens, Freya, Orsi, Giacomo, Piccioli, Giovanni, Sevel, Hadrien, Coulon, Louis, Pineros-Rodriguez, Manuela, Bonnassies, Marin, Hellich, Pierre, van Gerwen, Puck, Gambhir, Sankalp, Pirelli, Solal, Blanchard, Thomas, Callens, Timothée, Aoun, Toni Abi, Alonso, Yannick Calvino, Cho, Yuri, Chiappa, Alberto, Sclocchi, Antonio, Bruno, Étienne, Hofhammer, Florian, Pescia, Gabriel, Rizk, Geovani, Dadi, Leello, Stoffl, Lucas, Ribeiro, Manoel Horta, Bovel, Matthieu, Pan, Yueyang, Radenovic, Aleksandra, Alahi, Alexandre, Mathis, Alexander, Bitbol, Anne-Florence, Faltings, Boi, Hébert, Cécile, Tuia, Devis, Maréchal, François, Candea, George, Carleo, Giuseppe, Chappelier, Jean-Cédric, Flammarion, Nicolas, Fürbringer, Jean-Marie, Pellet, Jean-Philippe, Aberer, Karl, Zdeborová, Lenka, Salathé, Marcel, Jaggi, Martin, Rajman, Martin, Payer, Mathias, Wyart, Matthieu, Gastpar, Michael, Ceriotti, Michele, Svensson, Ola, Lévêque, Olivier, Ienne, Paolo, Guerraoui, Rachid, West, Robert, Kashyap, Sanidhya, Piazza, Valerio, Simanis, Viesturs, Kuncak, Viktor, Cevher, Volkan, Schwaller, Philippe, Friedli, Sacha, Jermann, Patrick, Kaser, Tanja, Bosselut, Antoine
AI assistants are being increasingly used by students enrolled in higher education institutions. While these tools provide opportunities for improved teaching and education, they also pose significant challenges for assessment and learning outcomes.
Externí odkaz:
http://arxiv.org/abs/2408.11841
Autor:
Harma, Simla Burcu, Chakraborty, Ayan, Kostenok, Elizaveta, Mishin, Danila, Ha, Dongho, Falsafi, Babak, Jaggi, Martin, Liu, Ming, Oh, Yunho, Subramanian, Suvinay, Yazdanbakhsh, Amir
The increasing size of deep neural networks necessitates effective model compression to improve computational efficiency and reduce their memory footprint. Sparsity and quantization are two prominent compression methods that have individually demonst
Externí odkaz:
http://arxiv.org/abs/2405.20935