Showing 1 - 10 of 33 for search: '"Pickett, Marc"'
Modern machine learning systems have demonstrated substantial abilities with methods that either embrace or ignore human-provided knowledge, but combining the benefits of both styles remains a challenge. One particular challenge involves designing learni…
External link:
http://arxiv.org/abs/2408.04242
A common way to extend the memory of large language models (LLMs) is retrieval-augmented generation (RAG), which inserts text retrieved from a larger memory into an LLM's context window. However, the context window is typically limited to several…
External link:
http://arxiv.org/abs/2407.12101
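The RAG scheme this snippet describes — retrieve passages from a larger memory, then insert the top hits into a limited context window — can be sketched minimally as follows. This is an illustrative toy, not the paper's method: all names are hypothetical, and it scores documents with bag-of-words cosine similarity, whereas real systems typically use learned embeddings.

```python
# Toy sketch of the RAG retrieval step (hypothetical names throughout).
# Memory is a list of documents; we rank them against the query by
# bag-of-words cosine similarity and prepend the top-k hits to the
# prompt, stopping once a fixed context-window word budget is spent.
from collections import Counter
import math


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query: str, memory: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = Counter(query.lower().split())
    ranked = sorted(
        memory,
        key=lambda doc: cosine(q, Counter(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, memory: list[str], budget_words: int = 50) -> str:
    """Insert retrieved text ahead of the query, respecting a word budget
    that stands in for the LLM's limited context window."""
    context, used = [], 0
    for doc in retrieve(query, memory):
        words = len(doc.split())
        if used + words > budget_words:  # context window is full
            break
        context.append(doc)
        used += words
    return "\n".join(context) + "\n\nQuestion: " + query
```

The word budget is the crux: because the context window is bounded, retrieval quality and truncation policy decide what the model actually sees — which is the limitation the abstract goes on to discuss.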
Despite their nearly universal adoption for large language models, the internal workings of transformers are not well understood. We aim to better understand the impact of removing or reorganizing information throughout the layers of a pretrained tra…
External link:
http://arxiv.org/abs/2407.09298
Author:
Thoppilan, Romal, De Freitas, Daniel, Hall, Jamie, Shazeer, Noam, Kulshreshtha, Apoorv, Cheng, Heng-Tze, Jin, Alicia, Bos, Taylor, Baker, Leslie, Du, Yu, Li, YaGuang, Lee, Hongrae, Zheng, Huaixiu Steven, Ghafouri, Amin, Menegali, Marcelo, Huang, Yanping, Krikun, Maxim, Lepikhin, Dmitry, Qin, James, Chen, Dehao, Xu, Yuanzhong, Chen, Zhifeng, Roberts, Adam, Bosma, Maarten, Zhao, Vincent, Zhou, Yanqi, Chang, Chung-Ching, Krivokon, Igor, Rusch, Will, Pickett, Marc, Srinivasan, Pranesh, Man, Laichee, Meier-Hellstern, Kathleen, Morris, Meredith Ringel, Doshi, Tulsee, Santos, Renelito Delos, Duke, Toju, Soraker, Johnny, Zevenbergen, Ben, Prabhakaran, Vinodkumar, Diaz, Mark, Hutchinson, Ben, Olson, Kristen, Molina, Alejandra, Hoffman-John, Erin, Lee, Josh, Aroyo, Lora, Rajakumar, Ravi, Butryna, Alena, Lamm, Matthew, Kuzmina, Viktoriya, Fenton, Joe, Cohen, Aaron, Bernstein, Rachel, Kurzweil, Ray, Aguera-Arcas, Blaise, Cui, Claire, Croak, Marian, Chi, Ed, Le, Quoc
We present LaMDA: Language Models for Dialog Applications. LaMDA is a family of Transformer-based neural language models specialized for dialog, which have up to 137B parameters and are pre-trained on 1.56T words of public dialog data and web text. W…
External link:
http://arxiv.org/abs/2201.08239
Author:
Lomonaco, Vincenzo, Pellegrini, Lorenzo, Rodriguez, Pau, Caccia, Massimo, She, Qi, Chen, Yu, Jodelet, Quentin, Wang, Ruiping, Mai, Zheda, Vazquez, David, Parisi, German I., Churamani, Nikhil, Pickett, Marc, Laradji, Issam, Maltoni, Davide
In the last few years, we have witnessed a renewed and fast-growing interest in continual learning with deep neural networks, with the shared objective of making current AI systems more adaptive, efficient, and autonomous. However, despite the signific…
External link:
http://arxiv.org/abs/2009.09929
Domain knowledge can often be encoded in the structure of a network, such as convolutional layers for vision, which has been shown to increase generalization and decrease sample complexity, i.e., the number of samples required for successful learning. I…
External link:
http://arxiv.org/abs/1707.03979
Author:
Lomonaco, Vincenzo, Pellegrini, Lorenzo, Rodriguez, Pau, Caccia, Massimo, She, Qi, Chen, Yu, Jodelet, Quentin, Wang, Ruiping, Mai, Zheda, Vazquez, David, Parisi, German I., Churamani, Nikhil, Pickett, Marc, Laradji, Issam, Maltoni, Davide
Published in:
Artificial Intelligence, vol. 303, February 2022
The long-term memory of most connectionist systems lies entirely in the weights of the system. Since the number of weights is typically fixed, this bounds the total amount of knowledge that can be learned and stored. Though this is not normally a pro…
External link:
http://arxiv.org/abs/1610.06402
We investigate the task of modeling open-domain, multi-turn, unstructured, multi-participant, conversational dialogue. We specifically study the effect of incorporating different elements of the conversation. Unlike previous efforts, which focused on…
External link:
http://arxiv.org/abs/1606.00372
Author:
Pickett, Marc, Aha, David W.
Most computational models of analogy assume they are given a delineated source domain and often a specified target domain. These systems do not address how analogs can be isolated from large domains and spontaneously retrieved from long-term memory, …
External link:
http://arxiv.org/abs/1310.2955