Showing 1 - 10 of 520 for search: '"Monteiro, João P"'
Author:
Monteiro, Joao, Noel, Pierre-Andre, Marcotte, Etienne, Rajeswar, Sai, Zantedeschi, Valentina, Vazquez, David, Chapados, Nicolas, Pal, Christopher, Taslakian, Perouz
Large Language Models (LLMs) are trained on vast amounts of data, most of which is automatically scraped from the internet. This data includes encyclopedic documents that harbor a vast amount of general knowledge (e.g., Wikipedia) but also potentiall
External link:
http://arxiv.org/abs/2406.11811
Author:
Monteiro, João, Marcotte, Étienne, Noël, Pierre-André, Zantedeschi, Valentina, Vázquez, David, Chapados, Nicolas, Pal, Christopher, Taslakian, Perouz
In-context learning (ICL) approaches typically leverage prompting to condition decoder-only language model generation on reference information. Just-in-time processing of a context is inefficient due to the quadratic cost of self-attention operations
External link:
http://arxiv.org/abs/2404.15420
Empirical risk minimization (ERM) is sensitive to spurious correlations in the training data, which poses a significant risk when deploying systems trained under this paradigm in high-stake applications. While the existing literature focuses on maxim
External link:
http://arxiv.org/abs/2310.18555
Author:
Guille-Escuret, Charles, Noël, Pierre-André, Mitliagkas, Ioannis, Vazquez, David, Monteiro, Joao
Improving the reliability of deployed machine learning systems often involves developing methods to detect out-of-distribution (OOD) inputs. However, existing research often narrowly focuses on samples from classes that are absent from the training s
External link:
http://arxiv.org/abs/2308.11480
Author:
Li, Raymond, Allal, Loubna Ben, Zi, Yangtian, Muennighoff, Niklas, Kocetkov, Denis, Mou, Chenghao, Marone, Marc, Akiki, Christopher, Li, Jia, Chim, Jenny, Liu, Qian, Zheltonozhskii, Evgenii, Zhuo, Terry Yue, Wang, Thomas, Dehaene, Olivier, Davaadorj, Mishig, Lamy-Poirier, Joel, Monteiro, João, Shliazhko, Oleh, Gontier, Nicolas, Meade, Nicholas, Zebaze, Armel, Yee, Ming-Ho, Umapathi, Logesh Kumar, Zhu, Jian, Lipkin, Benjamin, Oblokulov, Muhtasham, Wang, Zhiruo, Murthy, Rudra, Stillerman, Jason, Patel, Siva Sankalp, Abulkhanov, Dmitry, Zocca, Marco, Dey, Manan, Zhang, Zhihan, Fahmy, Nour, Bhattacharyya, Urvashi, Yu, Wenhao, Singh, Swayam, Luccioni, Sasha, Villegas, Paulo, Kunakov, Maxim, Zhdanov, Fedor, Romero, Manuel, Lee, Tony, Timor, Nadav, Ding, Jennifer, Schlesinger, Claire, Schoelkopf, Hailey, Ebert, Jan, Dao, Tri, Mishra, Mayank, Gu, Alex, Robinson, Jennifer, Anderson, Carolyn Jane, Dolan-Gavitt, Brendan, Contractor, Danish, Reddy, Siva, Fried, Daniel, Bahdanau, Dzmitry, Jernite, Yacine, Ferrandis, Carlos Muñoz, Hughes, Sean, Wolf, Thomas, Guha, Arjun, von Werra, Leandro, de Vries, Harm
The BigCode community, an open-scientific collaboration working on the responsible development of Large Language Models for Code (Code LLMs), introduces StarCoder and StarCoderBase: 15.5B parameter models with 8K context length, infilling capabilitie
External link:
http://arxiv.org/abs/2305.06161
Author:
Neto, Pedro C., Montezuma, Diana, Oliveira, Sara P., Oliveira, Domingos, Fraga, João, Monteiro, Ana, Monteiro, João, Ribeiro, Liliana, Gonçalves, Sofia, Reinhard, Stefan, Zlobec, Inti, Pinto, Isabel M., Cardoso, Jaime S.
Published in:
npj Precis. Onc. 8, 56 (2024)
Considering the profound transformation affecting pathology practice, we aimed to develop a scalable artificial intelligence (AI) system to diagnose colorectal cancer from whole-slide images (WSI). For this, we propose a deep learning (DL) system tha
External link:
http://arxiv.org/abs/2301.02608
Published in:
Advances in Neural Information Processing Systems 36 (2024)
Handling out-of-distribution (OOD) samples has become a major stake in the real-world deployment of machine learning systems. This work explores the use of self-supervised contrastive learning to the simultaneous detection of two types of OOD samples
External link:
http://arxiv.org/abs/2210.01742
A well-known failure mode of neural networks is that they may confidently return erroneous predictions. Such unsafe behaviour is particularly frequent when the use case slightly differs from the training context, and/or in the presence of an adversar
External link:
http://arxiv.org/abs/2208.14488
We study settings where gradient penalties are used alongside risk minimization with the goal of obtaining predictors satisfying different notions of monotonicity. Specifically, we present two sets of contributions. In the first part of the paper, we
External link:
http://arxiv.org/abs/2205.08247
Learning guarantees often rely on assumptions of i.i.d. data, which will likely be violated in practice once predictors are deployed to perform real-world tasks. Domain adaptation approaches thus appeared as a useful framework yielding extra flexibil
External link:
http://arxiv.org/abs/2106.13899