Showing 1 - 4 of 4 for search: '"Friesen, Abe"'
Author:
Gemma Team, Riviere, Morgane, Pathak, Shreya, Sessa, Pier Giuseppe, Hardin, Cassidy, Bhupatiraju, Surya, Hussenot, Léonard, Mesnard, Thomas, Shahriari, Bobak, Ramé, Alexandre, Ferret, Johan, Liu, Peter, Tafti, Pouya, Friesen, Abe, Casbon, Michelle, Ramos, Sabela, Kumar, Ravin, Lan, Charline Le, Jerome, Sammy, Tsitsulin, Anton, Vieillard, Nino, Stanczyk, Piotr, Girgin, Sertan, Momchev, Nikola, Hoffman, Matt, Thakoor, Shantanu, Grill, Jean-Bastien, Neyshabur, Behnam, Bachem, Olivier, Walton, Alanna, Severyn, Aliaksei, Parrish, Alicia, Ahmad, Aliya, Hutchison, Allen, Abdagic, Alvin, Carl, Amanda, Shen, Amy, Brock, Andy, Coenen, Andy, Laforge, Anthony, Paterson, Antonia, Bastian, Ben, Piot, Bilal, Wu, Bo, Royal, Brandon, Chen, Charlie, Kumar, Chintu, Perry, Chris, Welty, Chris, Choquette-Choo, Christopher A., Sinopalnikov, Danila, Weinberger, David, Vijaykumar, Dimple, Rogozińska, Dominika, Herbison, Dustin, Bandy, Elisa, Wang, Emma, Noland, Eric, Moreira, Erica, Senter, Evan, Eltyshev, Evgenii, Visin, Francesco, Rasskin, Gabriel, Wei, Gary, Cameron, Glenn, Martins, Gus, Hashemi, Hadi, Klimczak-Plucińska, Hanna, Batra, Harleen, Dhand, Harsh, Nardini, Ivan, Mein, Jacinda, Zhou, Jack, Svensson, James, Stanway, Jeff, Chan, Jetha, Zhou, Jin Peng, Carrasqueira, Joana, Iljazi, Joana, Becker, Jocelyn, Fernandez, Joe, van Amersfoort, Joost, Gordon, Josh, Lipschultz, Josh, Newlan, Josh, Ji, Ju-yeong, Mohamed, Kareem, Badola, Kartikeya, Black, Kat, Millican, Katie, McDonell, Keelin, Nguyen, Kelvin, Sodhia, Kiranbir, Greene, Kish, Sjoesund, Lars Lowe, Usui, Lauren, Sifre, Laurent, Heuermann, Lena, Lago, Leticia, McNealus, Lilly, Soares, Livio Baldini, Kilpatrick, Logan, Dixon, Lucas, Martins, Luciano, Reid, Machel, Singh, Manvinder, Iverson, Mark, Görner, Martin, Velloso, Mat, Wirth, Mateo, Davidow, Matt, Miller, Matt, Rahtz, Matthew, Watson, Matthew, Risdal, Meg, Kazemi, Mehran, Moynihan, Michael, Zhang, Ming, Kahng, Minsuk, Park, Minwoo, Rahman, Mofi, Khatwani, Mohit, Dao, 
Natalie, Bardoliwalla, Nenshad, Devanathan, Nesh, Dumai, Neta, Chauhan, Nilay, Wahltinez, Oscar, Botarda, Pankil, Barnes, Parker, Barham, Paul, Michel, Paul, Jin, Pengchong, Georgiev, Petko, Culliton, Phil, Kuppala, Pradeep, Comanescu, Ramona, Merhej, Ramona, Jana, Reena, Rokni, Reza Ardeshir, Agarwal, Rishabh, Mullins, Ryan, Saadat, Samaneh, Carthy, Sara Mc, Cogan, Sarah, Perrin, Sarah, Arnold, Sébastien M. R., Krause, Sebastian, Dai, Shengyang, Garg, Shruti, Sheth, Shruti, Ronstrom, Sue, Chan, Susan, Jordan, Timothy, Yu, Ting, Eccles, Tom, Hennigan, Tom, Kocisky, Tomas, Doshi, Tulsee, Jain, Vihan, Yadav, Vikas, Meshram, Vilobh, Dharmadhikari, Vishal, Barkley, Warren, Wei, Wei, Ye, Wenming, Han, Woohyun, Kwon, Woosuk, Xu, Xiang, Shen, Zhe, Gong, Zhitao, Wei, Zichuan, Cotruta, Victor, Kirk, Phoebe, Rao, Anand, Giang, Minh, Peran, Ludovic, Warkentin, Tris, Collins, Eli, Barral, Joelle, Ghahramani, Zoubin, Hadsell, Raia, Sculley, D., Banks, Jeanine, Dragan, Anca, Petrov, Slav, Vinyals, Oriol, Dean, Jeff, Hassabis, Demis, Kavukcuoglu, Koray, Farabet, Clement, Buchatskaya, Elena, Borgeaud, Sebastian, Fiedel, Noah, Joulin, Armand, Kenealy, Kathleen, Dadashi, Robert, Andreev, Alek
In this work, we introduce Gemma 2, a new addition to the Gemma family of lightweight, state-of-the-art open models, ranging in scale from 2 billion to 27 billion parameters. In this new version, we apply several known technical modifications to the
External link:
http://arxiv.org/abs/2408.00118
Author:
Sessa, Pier Giuseppe, Dadashi, Robert, Hussenot, Léonard, Ferret, Johan, Vieillard, Nino, Ramé, Alexandre, Shariari, Bobak, Perrin, Sarah, Friesen, Abe, Cideron, Geoffrey, Girgin, Sertan, Stanczyk, Piotr, Michi, Andrea, Sinopalnikov, Danila, Ramos, Sabela, Héliou, Amélie, Severyn, Aliaksei, Hoffman, Matt, Momchev, Nikola, Bachem, Olivier
Reinforcement learning from human feedback (RLHF) is a key driver of quality and safety in state-of-the-art large language models. Yet, a surprisingly simple and strong inference-time strategy is Best-of-N sampling that selects the best generation am
External link:
http://arxiv.org/abs/2407.14622
Author:
Shahriari, Bobak, Abdolmaleki, Abbas, Byravan, Arunkumar, Friesen, Abe, Liu, Siqi, Springenberg, Jost Tobias, Heess, Nicolas, Hoffman, Matt, Riedmiller, Martin
Actor-critic algorithms that make use of distributional policy evaluation have frequently been shown to outperform their non-distributional counterparts on many challenging control tasks. Examples of this behavior include the D4PG and DMPO algorithms
External link:
http://arxiv.org/abs/2204.10256
Author:
Hoffman, Matthew W., Shahriari, Bobak, Aslanides, John, Barth-Maron, Gabriel, Momchev, Nikola, Sinopalnikov, Danila, Stańczyk, Piotr, Ramos, Sabela, Raichuk, Anton, Vincent, Damien, Hussenot, Léonard, Dadashi, Robert, Dulac-Arnold, Gabriel, Orsini, Manu, Jacq, Alexis, Ferret, Johan, Vieillard, Nino, Ghasemipour, Seyed Kamyar Seyed, Girgin, Sertan, Pietquin, Olivier, Behbahani, Feryal, Norman, Tamara, Abdolmaleki, Abbas, Cassirer, Albin, Yang, Fan, Baumli, Kate, Henderson, Sarah, Friesen, Abe, Haroun, Ruba, Novikov, Alex, Colmenarejo, Sergio Gómez, Cabi, Serkan, Gulcehre, Caglar, Paine, Tom Le, Srinivasan, Srivatsan, Cowie, Andrew, Wang, Ziyu, Piot, Bilal, de Freitas, Nando
Deep reinforcement learning (RL) has led to many recent and groundbreaking advances. However, these advances have often come at the cost of both increased scale in the underlying architectures being trained as well as increased complexity of the RL a
External link:
http://arxiv.org/abs/2006.00979