Výsledky vyhledávání - "Cheng, Youlong"

Report

Monolith: Real Time Recommendation System With Collisionless Embedding Table

Autor: Liu, Zhuoran, Zou, Leqi, Zou, Xuan, Wang, Caihua, Zhang, Biao, Tang, Da, Zhu, Bolin, Zhu, Yijie, Wu, Peng, Wang, Ke, Cheng, Youlong

Building a scalable and real-time recommendation system is vital for many businesses driven by time-sensitive customer feedback, such as short-videos ranking or online ads. Despite the ubiquitous adoption of production-scale deep learning frameworks

Externí odkaz: http://arxiv.org/abs/2209.07663

Zobrazit plný text záznamu

Report

CowClip: Reducing CTR Prediction Model Training Time from 12 hours to 10 minutes on 1 GPU

Autor: Zheng, Zangwei, Xu, Pengtai, Zou, Xuan, Tang, Da, Li, Zhen, Xi, Chenguang, Wu, Peng, Zou, Leqi, Zhu, Yijie, Chen, Ming, Ding, Xiangzhuo, Xue, Fuzhao, Qin, Ziheng, Cheng, Youlong, You, Yang

The click-through rate (CTR) prediction task is to predict whether a user will click on the recommended item. As mind-boggling amounts of data are produced online daily, accelerating CTR prediction model training is critical to ensuring an up-to-date

Externí odkaz: http://arxiv.org/abs/2204.06240

Zobrazit plný text záznamu

Report

Enhanced Exploration in Neural Feature Selection for Deep Click-Through Rate Prediction Models via Ensemble of Gating Layers

Autor: Guan, Lin, Xiao, Xia, Chen, Ming, Cheng, Youlong

Feature selection has been an essential step in developing industry-scale deep Click-Through Rate (CTR) prediction systems. The goal of neural feature selection (NFS) is to choose a relatively small subset of features with the best explanatory power

Externí odkaz: http://arxiv.org/abs/2112.03487

Zobrazit plný text záznamu

Report

Toward Annotator Group Bias in Crowdsourcing

Autor: Liu, Haochen, Thekinen, Joseph, Mollaoglu, Sinem, Tang, Da, Yang, Ji, Cheng, Youlong, Liu, Hui, Tang, Jiliang

Crowdsourcing has emerged as a popular approach for collecting annotated data to train supervised machine learning models. However, annotator bias can lead to defective annotations. Though there are a few works investigating individual annotator bias

Externí odkaz: http://arxiv.org/abs/2110.08038

Zobrazit plný text záznamu

Report

Talking-Heads Attention

Autor: Shazeer, Noam, Lan, Zhenzhong, Cheng, Youlong, Ding, Nan, Hou, Le

We introduce "talking-heads attention" - a variation on multi-head attention which includes linearprojections across the attention-heads dimension, immediately before and after the softmax operation.While inserting only a small number of additional p

Externí odkaz: http://arxiv.org/abs/2003.02436

Zobrazit plný text záznamu

Report

High Resolution Medical Image Analysis with Spatial Partitioning

Autor: Hou, Le, Cheng, Youlong, Shazeer, Noam, Parmar, Niki, Li, Yeqing, Korfiatis, Panagiotis, Drucker, Travis M., Blezek, Daniel J., Song, Xiaodan

Medical images such as 3D computerized tomography (CT) scans and pathology images, have hundreds of millions or billions of voxels/pixels. It is infeasible to train CNN models directly on such high resolution images, because neural activations of a s

Externí odkaz: http://arxiv.org/abs/1909.03108

Zobrazit plný text záznamu

Report

Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

Autor: Shen, Jonathan, Nguyen, Patrick, Wu, Yonghui, Chen, Zhifeng, Chen, Mia X., Jia, Ye, Kannan, Anjuli, Sainath, Tara, Cao, Yuan, Chiu, Chung-Cheng, He, Yanzhang, Chorowski, Jan, Hinsu, Smit, Laurenzo, Stella, Qin, James, Firat, Orhan, Macherey, Wolfgang, Gupta, Suyog, Bapna, Ankur, Zhang, Shuyuan, Pang, Ruoming, Weiss, Ron J., Prabhavalkar, Rohit, Liang, Qiao, Jacob, Benoit, Liang, Bowen, Lee, HyoukJoong, Chelba, Ciprian, Jean, Sébastien, Li, Bo, Johnson, Melvin, Anil, Rohan, Tibrewal, Rajat, Liu, Xiaobing, Eriguchi, Akiko, Jaitly, Navdeep, Ari, Naveen, Cherry, Colin, Haghani, Parisa, Good, Otavio, Cheng, Youlong, Alvarez, Raziel, Caswell, Isaac, Hsu, Wei-Ning, Yang, Zongheng, Wang, Kuan-Chieh, Gonina, Ekaterina, Tomanek, Katrin, Vanik, Ben, Wu, Zelin, Jones, Llion, Schuster, Mike, Huang, Yanping, Chen, Dehao, Irie, Kazuki, Foster, George, Richardson, John, Macherey, Klaus, Bruguier, Antoine, Zen, Heiga, Raffel, Colin, Kumar, Shankar, Rao, Kanishka, Rybach, David, Murray, Matthew, Peddinti, Vijayaditya, Krikun, Maxim, Bacchiani, Michiel A. U., Jablin, Thomas B., Suderman, Rob, Williams, Ian, Lee, Benjamin, Bhatia, Deepti, Carlson, Justin, Yavuz, Semih, Zhang, Yu, McGraw, Ian, Galkin, Max, Ge, Qi, Pundak, Golan, Whipkey, Chad, Wang, Todd, Alon, Uri, Lepikhin, Dmitry, Tian, Ye, Sabour, Sara, Chan, William, Toshniwal, Shubham, Liao, Baohua, Nirschl, Michael, Rondon, Pat

Lingvo is a Tensorflow framework offering a complete solution for collaborative deep learning research, with a particular focus towards sequence-to-sequence models. Lingvo models are composed of modular building blocks that are flexible and easily ex

Externí odkaz: http://arxiv.org/abs/1902.08295

Zobrazit plný text záznamu

Report

Image Classification at Supercomputer Scale

Autor: Ying, Chris, Kumar, Sameer, Chen, Dehao, Wang, Tao, Cheng, Youlong

Deep learning is extremely computationally intensive, and hardware vendors have responded by building faster accelerators in large clusters. Training deep learning models at petaFLOPS scale requires overcoming both algorithmic and systems software ch

Externí odkaz: http://arxiv.org/abs/1811.06992

Zobrazit plný text záznamu

Report

GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism

Autor: Huang, Yanping, Cheng, Youlong, Bapna, Ankur, Firat, Orhan, Chen, Mia Xu, Chen, Dehao, Lee, HyoukJoong, Ngiam, Jiquan, Le, Quoc V., Wu, Yonghui, Chen, Zhifeng

Scaling up deep neural network capacity has been known as an effective approach to improving model quality for several different machine learning tasks. In many cases, increasing model capacity beyond the memory limit of a single accelerator has requ

Externí odkaz: http://arxiv.org/abs/1811.06965

Zobrazit plný text záznamu

Report

Mesh-TensorFlow: Deep Learning for Supercomputers

Autor: Shazeer, Noam, Cheng, Youlong, Parmar, Niki, Tran, Dustin, Vaswani, Ashish, Koanantakool, Penporn, Hawkins, Peter, Lee, HyoukJoong, Hong, Mingsheng, Young, Cliff, Sepassi, Ryan, Hechtman, Blake

Batch-splitting (data-parallelism) is the dominant distributed Deep Neural Network (DNN) training strategy, due to its universal applicability and its amenability to Single-Program-Multiple-Data (SPMD) programming. However, batch-splitting suffers fr

Externí odkaz: http://arxiv.org/abs/1811.02084

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání