Zobrazeno 1 - 10
of 1 121
pro vyhledávání: '"Shamis, A"'
Autor:
Fernandez, Jared, Wehrstedt, Luca, Shamis, Leonid, Elhoushi, Mostafa, Saladi, Kalyan, Bisk, Yonatan, Strubell, Emma, Kahn, Jacob
Dramatic increases in the capabilities of neural network models in recent years are driven by scaling model size, training data, and corresponding computational resources. To develop the exceedingly large networks required in modern applications, suc
Externí odkaz:
http://arxiv.org/abs/2411.13055
Autor:
Lee, Yejin, Sun, Anna, Hosmer, Basil, Acun, Bilge, Balioglu, Can, Wang, Changhan, Hernandez, Charles David, Puhrsch, Christian, Haziza, Daniel, Guessous, Driss, Massa, Francisco, Kahn, Jacob, Wan, Jeffrey, Reizenstein, Jeremy, Zhai, Jiaqi, Isaacson, Joe, Schlosser, Joel, Pino, Juan, Sadagopan, Kaushik Ram, Shamis, Leonid, Ma, Linjian, Hwang, Min-Jae, Chen, Mingda, Elhoushi, Mostafa, Rodriguez, Pedro, Pasunuru, Ram, Yih, Scott, Popuri, Sravya, Liu, Xing, Wu, Carole-Jean
Generative artificial intelligence (AI) technology is revolutionizing the computing industry. Not only its applications have broadened to various sectors but also poses new system design and optimization opportunities. The technology is capable of un
Externí odkaz:
http://arxiv.org/abs/2410.00215
Autor:
Zhou, Chunting, Yu, Lili, Babu, Arun, Tirumala, Kushal, Yasunaga, Michihiro, Shamis, Leonid, Kahn, Jacob, Ma, Xuezhe, Zettlemoyer, Luke, Levy, Omer
We introduce Transfusion, a recipe for training a multi-modal model over discrete and continuous data. Transfusion combines the language modeling loss function (next token prediction) with diffusion to train a single transformer over mixed-modality s
Externí odkaz:
http://arxiv.org/abs/2408.11039
Autor:
Nvidia, Adler, Bo, Agarwal, Niket, Aithal, Ashwath, Anh, Dong H., Bhattacharya, Pallab, Brundyn, Annika, Casper, Jared, Catanzaro, Bryan, Clay, Sharon, Cohen, Jonathan, Das, Sirshak, Dattagupta, Ayush, Delalleau, Olivier, Derczynski, Leon, Dong, Yi, Egert, Daniel, Evans, Ellie, Ficek, Aleksander, Fridman, Denys, Ghosh, Shaona, Ginsburg, Boris, Gitman, Igor, Grzegorzek, Tomasz, Hero, Robert, Huang, Jining, Jawa, Vibhu, Jennings, Joseph, Jhunjhunwala, Aastha, Kamalu, John, Khan, Sadaf, Kuchaiev, Oleksii, LeGresley, Patrick, Li, Hui, Liu, Jiwei, Liu, Zihan, Long, Eileen, Mahabaleshwarkar, Ameya Sunil, Majumdar, Somshubra, Maki, James, Martinez, Miguel, de Melo, Maer Rodrigues, Moshkov, Ivan, Narayanan, Deepak, Narenthiran, Sean, Navarro, Jesus, Nguyen, Phong, Nitski, Osvald, Noroozi, Vahid, Nutheti, Guruprasad, Parisien, Christopher, Parmar, Jupinder, Patwary, Mostofa, Pawelec, Krzysztof, Ping, Wei, Prabhumoye, Shrimai, Roy, Rajarshi, Saar, Trisha, Sabavat, Vasanth Rao Naik, Satheesh, Sanjeev, Scowcroft, Jane Polak, Sewall, Jason, Shamis, Pavel, Shen, Gerald, Shoeybi, Mohammad, Sizer, Dave, Smelyanskiy, Misha, Soares, Felipe, Sreedhar, Makesh Narsimhan, Su, Dan, Subramanian, Sandeep, Sun, Shengyang, Toshniwal, Shubham, Wang, Hao, Wang, Zhilin, You, Jiaxuan, Zeng, Jiaqi, Zhang, Jimmy, Zhang, Jing, Zhang, Vivienne, Zhang, Yian, Zhu, Chen
We release the Nemotron-4 340B model family, including Nemotron-4-340B-Base, Nemotron-4-340B-Instruct, and Nemotron-4-340B-Reward. Our models are open access under the NVIDIA Open Model License Agreement, a permissive model license that allows distri
Externí odkaz:
http://arxiv.org/abs/2406.11704
In this paper, we present a framework for moving compute and data between processing elements in a distributed heterogeneous system. The implementation of the framework is based on the LLVM compiler toolchain combined with the UCX communication frame
Externí odkaz:
http://arxiv.org/abs/2208.01154
Marketplaces for machine learning (ML) models are emerging as a way for organizations to monetize models. They allow model owners to retain control over hosted models by using cloud resources to execute ML inference requests for a fee, preserving mod
Externí odkaz:
http://arxiv.org/abs/2205.15757
Hardware-based Trusted Execution Environments (TEEs) are becoming increasingly prevalent in cloud computing, forming the basis for confidential computing. However, the security goals of TEEs sometimes conflict with existing cloud functionality, such
Externí odkaz:
http://arxiv.org/abs/2205.15359
Autor:
Shamis, Mira, Sodin, Sasha
Motivated by the research on upper bounds on the rate of quantum transport for one-dimensional operators, particularly, the recent works of Jitomirskaya--Liu and Jitomirskaya--Powell and the earlier ones of Damanik--Tcheremchantsev, we propose a meth
Externí odkaz:
http://arxiv.org/abs/2111.10902
We show that some spectral properties of the almost Mathieu operator with frequency well approximated by rationals can be as poor as at all possible in the class of all one-dimensional discrete Schroedinger operators. For the class of critical coupli
Externí odkaz:
http://arxiv.org/abs/2110.07974
Network library APIs have historically been developed with the emphasis on data movement, placement, and communication semantics. Many communication semantics are available across a large variety of network libraries, such as send-receive, data strea
Externí odkaz:
http://arxiv.org/abs/2110.06292