Search results - "Botev, P. A."

Report

Group COMBSS: Group Selection via Continuous Optimization

Authors: Mathur, Anant, Moka, Sarat, Liquet, Benoit, Botev, Zdravko

We present a new optimization method for the group selection problem in linear regression. In this problem, predictors are assumed to have a natural group structure and the goal is to select a small set of groups that best fits the response. The inco

External link: http://arxiv.org/abs/2404.13339

Report

RecurrentGemma: Moving Past Transformers for Efficient Open Language Models

Authors: Botev, Aleksandar, De, Soham, Smith, Samuel L, Fernando, Anushan, Muraru, George-Cristian, Haroun, Ruba, Berrada, Leonard, Pascanu, Razvan, Sessa, Pier Giuseppe, Dadashi, Robert, Hussenot, Léonard, Ferret, Johan, Girgin, Sertan, Bachem, Olivier, Andreev, Alek, Kenealy, Kathleen, Mesnard, Thomas, Hardin, Cassidy, Bhupatiraju, Surya, Pathak, Shreya, Sifre, Laurent, Rivière, Morgane, Kale, Mihir Sanjay, Love, Juliette, Tafti, Pouya, Joulin, Armand, Fiedel, Noah, Senter, Evan, Chen, Yutian, Srinivasan, Srivatsan, Desjardins, Guillaume, Budden, David, Doucet, Arnaud, Vikram, Sharad, Paszke, Adam, Gale, Trevor, Borgeaud, Sebastian, Chen, Charlie, Brock, Andy, Paterson, Antonia, Brennan, Jenny, Risdal, Meg, Gundluru, Raj, Devanathan, Nesh, Mooney, Paul, Chauhan, Nilay, Culliton, Phil, Martins, Luiz Gustavo, Bandy, Elisa, Huntsperger, David, Cameron, Glenn, Zucker, Arthur, Warkentin, Tris, Peran, Ludovic, Giang, Minh, Ghahramani, Zoubin, Farabet, Clément, Kavukcuoglu, Koray, Hassabis, Demis, Hadsell, Raia, Teh, Yee Whye, de Freitas, Nando

We introduce RecurrentGemma, a family of open language models which uses Google's novel Griffin architecture. Griffin combines linear recurrences with local attention to achieve excellent performance on language. It has a fixed-sized state, which red

External link: http://arxiv.org/abs/2404.07839

Report

Gemma: Open Models Based on Gemini Research and Technology

Authors: Gemma Team, Mesnard, Thomas, Hardin, Cassidy, Dadashi, Robert, Bhupatiraju, Surya, Pathak, Shreya, Sifre, Laurent, Rivière, Morgane, Kale, Mihir Sanjay, Love, Juliette, Tafti, Pouya, Hussenot, Léonard, Sessa, Pier Giuseppe, Chowdhery, Aakanksha, Roberts, Adam, Barua, Aditya, Botev, Alex, Castro-Ros, Alex, Slone, Ambrose, Héliou, Amélie, Tacchetti, Andrea, Bulanova, Anna, Paterson, Antonia, Tsai, Beth, Shahriari, Bobak, Lan, Charline Le, Choquette-Choo, Christopher A., Crepy, Clément, Cer, Daniel, Ippolito, Daphne, Reid, David, Buchatskaya, Elena, Ni, Eric, Noland, Eric, Yan, Geng, Tucker, George, Muraru, George-Christian, Rozhdestvenskiy, Grigory, Michalewski, Henryk, Tenney, Ian, Grishchenko, Ivan, Austin, Jacob, Keeling, James, Labanowski, Jane, Lespiau, Jean-Baptiste, Stanway, Jeff, Brennan, Jenny, Chen, Jeremy, Ferret, Johan, Chiu, Justin, Mao-Jones, Justin, Lee, Katherine, Yu, Kathy, Millican, Katie, Sjoesund, Lars Lowe, Lee, Lisa, Dixon, Lucas, Reid, Machel, Mikuła, Maciej, Wirth, Mateo, Sharman, Michael, Chinaev, Nikolai, Thain, Nithum, Bachem, Olivier, Chang, Oscar, Wahltinez, Oscar, Bailey, Paige, Michel, Paul, Yotov, Petko, Chaabouni, Rahma, Comanescu, Ramona, Jana, Reena, Anil, Rohan, McIlroy, Ross, Liu, Ruibo, Mullins, Ryan, Smith, Samuel L, Borgeaud, Sebastian, Girgin, Sertan, Douglas, Sholto, Pandya, Shree, Shakeri, Siamak, De, Soham, Klimenko, Ted, Hennigan, Tom, Feinberg, Vlad, Stokowiec, Wojciech, Chen, Yu-hui, Ahmed, Zafarali, Gong, Zhitao, Warkentin, Tris, Peran, Ludovic, Giang, Minh, Farabet, Clément, Vinyals, Oriol, Dean, Jeff, Kavukcuoglu, Koray, Hassabis, Demis, Ghahramani, Zoubin, Eck, Douglas, Barral, Joelle, Pereira, Fernando, Collins, Eli, Joulin, Armand, Fiedel, Noah, Senter, Evan, Andreev, Alek, Kenealy, Kathleen

This work introduces Gemma, a family of lightweight, state-of-the-art open models built from the research and technology used to create Gemini models. Gemma models demonstrate strong performance across academic benchmarks for language understanding,

External link: http://arxiv.org/abs/2403.08295

Report

Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models

Authors: De, Soham, Smith, Samuel L., Fernando, Anushan, Botev, Aleksandar, Cristian-Muraru, George, Gu, Albert, Haroun, Ruba, Berrada, Leonard, Chen, Yutian, Srinivasan, Srivatsan, Desjardins, Guillaume, Doucet, Arnaud, Budden, David, Teh, Yee Whye, Pascanu, Razvan, De Freitas, Nando, Gulcehre, Caglar

Recurrent neural networks (RNNs) have fast inference and scale efficiently on long sequences, but they are difficult to train and hard to scale. We propose Hawk, an RNN with gated linear recurrences, and Griffin, a hybrid model that mixes gated linea

External link: http://arxiv.org/abs/2402.19427

Report

Applications of flow models to the generation of correlated lattice QCD ensembles

Authors: Abbott, Ryan, Botev, Aleksandar, Boyda, Denis, Hackett, Daniel C., Kanwar, Gurtej, Racanière, Sébastien, Rezende, Danilo J., Romero-López, Fernando, Shanahan, Phiala E., Urban, Julian M.

Machine-learned normalizing flows can be used in the context of lattice quantum field theory to generate statistically correlated ensembles of lattice gauge fields at different action parameters. This work demonstrates how these correlations can be e

External link: http://arxiv.org/abs/2401.10874

Report

Generalized Linear Models via the Lasso: To Scale or Not to Scale?

Authors: Mathur, Anant, Moka, Sarat, Botev, Zdravko

The Lasso regression is a popular regularization method for feature selection in statistics. Prior to computing the Lasso estimator in both linear and generalized linear models, it is common to conduct a preliminary rescaling of the feature matrix to

External link: http://arxiv.org/abs/2311.11236

Report

Normalizing flows for lattice gauge theory in arbitrary space-time dimension

Authors: Abbott, Ryan, Albergo, Michael S., Botev, Aleksandar, Boyda, Denis, Cranmer, Kyle, Hackett, Daniel C., Kanwar, Gurtej, Matthews, Alexander G. D. G., Racanière, Sébastien, Razavi, Ali, Rezende, Danilo J., Romero-López, Fernando, Shanahan, Phiala E., Urban, Julian M.

Applications of normalizing flows to the sampling of field configurations in lattice gauge theory have so far been explored almost exclusively in two space-time dimensions. We report new algorithmic developments of gauge-equivariant flow architecture

External link: http://arxiv.org/abs/2305.02402

Report

Column Subset Selection and Nyström Approximation via Continuous Optimization

Authors: Mathur, Anant, Moka, Sarat, Botev, Zdravko

We propose a continuous optimization algorithm for the Column Subset Selection Problem (CSSP) and Nyström approximation. The CSSP and Nyström method construct low-rank approximations of matrices based on a predetermined subset of columns. It is w

External link: http://arxiv.org/abs/2304.09678

Academic article

Comparative evaluation of surgical procedures for trigeminal neuralgia: a literature review

Authors: Vyacheslav S. Botev, Yurii V. Hryniv, Viktoria A. Gryb

Published in: Ukrainian Neurosurgical Journal, Vol 30, Iss 3, Pp 3-17 (2024)

Trigeminal Neuralgia (TN) has been described in the literature as one of the commonest types of craniofacial pain disorders. TN refers to recurrent lancinating pain that occurs in the distribution of one or more branches of the fifth cranial nerve. T

External link: https://doaj.org/article/730c47bed7c64bb3b7714ed163aa7495

Report

Deep Transformers without Shortcuts: Modifying Self-attention for Faithful Signal Propagation

Authors: He, Bobby, Martens, James, Zhang, Guodong, Botev, Aleksandar, Brock, Andrew, Smith, Samuel L, Teh, Yee Whye

Skip connections and normalisation layers form two standard architectural components that are ubiquitous for the training of Deep Neural Networks (DNNs), but whose precise roles are poorly understood. Recent approaches such as Deep Kernel Shaping hav

External link: http://arxiv.org/abs/2302.10322
