Showing 1 - 10 of 77 for the search: '"CHEN Dehao"'
Author:
Thoppilan, Romal, De Freitas, Daniel, Hall, Jamie, Shazeer, Noam, Kulshreshtha, Apoorv, Cheng, Heng-Tze, Jin, Alicia, Bos, Taylor, Baker, Leslie, Du, Yu, Li, YaGuang, Lee, Hongrae, Zheng, Huaixiu Steven, Ghafouri, Amin, Menegali, Marcelo, Huang, Yanping, Krikun, Maxim, Lepikhin, Dmitry, Qin, James, Chen, Dehao, Xu, Yuanzhong, Chen, Zhifeng, Roberts, Adam, Bosma, Maarten, Zhao, Vincent, Zhou, Yanqi, Chang, Chung-Ching, Krivokon, Igor, Rusch, Will, Pickett, Marc, Srinivasan, Pranesh, Man, Laichee, Meier-Hellstern, Kathleen, Morris, Meredith Ringel, Doshi, Tulsee, Santos, Renelito Delos, Duke, Toju, Soraker, Johnny, Zevenbergen, Ben, Prabhakaran, Vinodkumar, Diaz, Mark, Hutchinson, Ben, Olson, Kristen, Molina, Alejandra, Hoffman-John, Erin, Lee, Josh, Aroyo, Lora, Rajakumar, Ravi, Butryna, Alena, Lamm, Matthew, Kuzmina, Viktoriya, Fenton, Joe, Cohen, Aaron, Bernstein, Rachel, Kurzweil, Ray, Aguera-Arcas, Blaise, Cui, Claire, Croak, Marian, Chi, Ed, Le, Quoc
We present LaMDA: Language Models for Dialog Applications. LaMDA is a family of Transformer-based neural language models specialized for dialog, which have up to 137B parameters and are pre-trained on 1.56T words of public dialog data and web text. W…
External link:
http://arxiv.org/abs/2201.08239
Published in:
In Sustainable Cities and Society 1 October 2024 112
Author:
Xu, Yuanzhong, Lee, HyoukJoong, Chen, Dehao, Hechtman, Blake, Huang, Yanping, Joshi, Rahul, Krikun, Maxim, Lepikhin, Dmitry, Ly, Andy, Maggioni, Marcello, Pang, Ruoming, Shazeer, Noam, Wang, Shibo, Wang, Tao, Wu, Yonghui, Chen, Zhifeng
We present GSPMD, an automatic, compiler-based parallelization system for common machine learning computations. It allows users to write programs in the same way as for a single device, then give hints through a few annotations on how to distribute t…
External link:
http://arxiv.org/abs/2105.04663
Published in:
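The annotation-driven idea in the GSPMD abstract — write the program once as if for a single device, then let the system partition it — can be illustrated with a plain-numpy simulation. This is a hypothetical sketch, not the GSPMD API: the function names and the "shard rows across devices" hint are invented for illustration.

```python
import numpy as np

# Hypothetical simulation of annotation-style partitioning (not the real
# GSPMD API). The user writes a single-device matmul; a sharding hint says
# to split the left operand's rows across devices. The system then runs the
# same local program per shard and stitches the results back together.

def matmul_single_device(x, w):
    """The program as the user writes it: one device, full arrays."""
    return x @ w

def matmul_sharded(x, w, num_devices):
    """Simulated partitioned execution: x row-sharded, w replicated."""
    shards = np.array_split(x, num_devices, axis=0)  # one shard per device
    partials = [s @ w for s in shards]               # each device's local work
    return np.concatenate(partials, axis=0)          # gather the row shards

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4))
w = rng.standard_normal((4, 3))

# Partitioned execution matches the single-device program.
assert np.allclose(matmul_single_device(x, w),
                   matmul_sharded(x, w, num_devices=4))
```

The point of the sketch is the separation of concerns: the numerics live in one single-device program, while the distribution decision is a side annotation that does not change the program's meaning.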
In Epidemics March 2024 46
Author:
Chen, Dehao, Mo, Zhenwu, Liang, Zehong, Jiang, Junjie, Tang, Huilin, Sun, Yidan, Wang, Ziyu, Wei, Quanfeng, Chen, Yanru, Deng, Dongmei
Published in:
In Optics Communications 1 March 2024 554
Author:
Kumar, Sameer, Bradbury, James, Young, Cliff, Wang, Yu Emma, Levskaya, Anselm, Hechtman, Blake, Chen, Dehao, Lee, HyoukJoong, Deveci, Mehmet, Kumar, Naveen, Kanwar, Pankaj, Wang, Shibo, Wanderman-Milne, Skye, Lacy, Steve, Wang, Tao, Oguntebi, Tayo, Zu, Yazhou, Xu, Yuanzhong, Swing, Andy
Recent results in language understanding using neural networks have required training hardware of unprecedented scale, with thousands of chips cooperating on a single training run. This paper presents techniques to scale ML models on the Google TPU Mul…
External link:
http://arxiv.org/abs/2011.03641
Author:
Lepikhin, Dmitry, Lee, HyoukJoong, Xu, Yuanzhong, Chen, Dehao, Firat, Orhan, Huang, Yanping, Krikun, Maxim, Shazeer, Noam, Chen, Zhifeng
Neural network scaling has been critical for improving the model quality in many real-world machine learning applications with vast amounts of training data and compute. Although this trend of scaling is affirmed to be a sure-fire approach for better…
External link:
http://arxiv.org/abs/2006.16668
In data-parallel synchronous training of deep neural networks, different devices (replicas) run the same program with different partitions of the training batch, but weight update computation is repeated on all replicas, because the weights do not ha…
External link:
http://arxiv.org/abs/2004.13336
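The redundancy this abstract describes — every replica repeating the identical full weight update — can be removed by sharding the update across replicas. The following is a rough numpy simulation under assumed names, not the paper's actual compiler transformation: gradients are reduce-scattered so each replica updates only its 1/N slice of the weights, and the updated slices are then all-gathered.

```python
import numpy as np

# Rough simulation of a sharded weight update (hypothetical, illustrative
# only). Baseline: every replica all-reduces the full gradient and repeats
# the identical SGD step. Sharded: each replica updates only its own slice
# of the weights, then the slices are all-gathered.

def update_replicated(weights, per_replica_grads, lr):
    """Baseline: identical full update repeated on every replica."""
    grad = np.sum(per_replica_grads, axis=0)  # all-reduce of gradients
    return weights - lr * grad                # same step on each replica

def update_sharded(weights, per_replica_grads, lr):
    """Each replica updates one shard; shards are then all-gathered."""
    n = len(per_replica_grads)
    grad_shards = [np.array_split(g, n) for g in per_replica_grads]
    weight_shards = np.array_split(weights, n)
    new_shards = []
    for i in range(n):  # replica i owns shard i
        # reduce-scatter: replica i receives only the sum for shard i
        g_i = np.sum([grad_shards[r][i] for r in range(n)], axis=0)
        new_shards.append(weight_shards[i] - lr * g_i)  # local shard update
    return np.concatenate(new_shards)  # all-gather the updated shards

rng = np.random.default_rng(1)
w = rng.standard_normal(12)
grads = rng.standard_normal((4, 12))  # 4 replicas, one gradient each

# Sharded update produces the same weights as the replicated baseline.
assert np.allclose(update_replicated(w, grads, 0.1),
                   update_sharded(w, grads, 0.1))
```

In the sketch each replica does 1/N of the update arithmetic instead of all of it, which is the saving the abstract alludes to; the real system obtains it automatically by rewriting the compiled training program.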
Author:
Mattson, Peter, Cheng, Christine, Coleman, Cody, Diamos, Greg, Micikevicius, Paulius, Patterson, David, Tang, Hanlin, Wei, Gu-Yeon, Bailis, Peter, Bittorf, Victor, Brooks, David, Chen, Dehao, Dutta, Debojyoti, Gupta, Udit, Hazelwood, Kim, Hock, Andrew, Huang, Xinyuan, Ike, Atsushi, Jia, Bill, Kang, Daniel, Kanter, David, Kumar, Naveen, Liao, Jeffery, Ma, Guokai, Narayanan, Deepak, Oguntebi, Tayo, Pekhimenko, Gennady, Pentecost, Lillian, Reddi, Vijay Janapa, Robie, Taylor, John, Tom St., Tabaru, Tsuguchika, Wu, Carole-Jean, Xu, Lingjie, Yamazaki, Masafumi, Young, Cliff, Zaharia, Matei
Machine learning (ML) needs industry-standard performance benchmarks to support design and competitive evaluation of the many emerging software and hardware solutions for ML. But ML training presents three unique benchmarking challenges absent from o…
External link:
http://arxiv.org/abs/1910.01500
Author:
Kumar, Sameer, Bitorff, Victor, Chen, Dehao, Chou, Chiachen, Hechtman, Blake, Lee, HyoukJoong, Kumar, Naveen, Mattson, Peter, Wang, Shibo, Wang, Tao, Xu, Yuanzhong, Zhou, Zongwei
The recent submission of Google TPU-v3 Pods to the industry-wide MLPerf v0.6 training benchmark demonstrates the scalability of a suite of industry-relevant ML models. MLPerf defines a suite of models, datasets and rules to follow when benchmarking t…
External link:
http://arxiv.org/abs/1909.09756