Showing 1 - 10 of 45 results for search query: '"Baykal, Cenk"'
Knowledge distillation with unlabeled examples is a powerful training paradigm for generating compact and lightweight student models in applications where the amount of labeled data is limited but one has access to a large pool of unlabeled data. In …
External link:
http://arxiv.org/abs/2302.03806
It has been well established that increasing scale in deep transformer networks leads to improved quality and performance. However, this increase in scale often comes with prohibitive increases in compute cost and inference latency. We introduce Alte…
External link:
http://arxiv.org/abs/2301.13310
One way of introducing sparsity into deep networks is by attaching an external table of parameters that is sparsely looked up at different layers of the network. By storing the bulk of the parameters in the external table, one can increase the capacity…
External link:
http://arxiv.org/abs/2302.00003
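The entry above describes the mechanism only in outline. A minimal runnable sketch of the idea in NumPy follows; the hashing scheme, table size, and all names here are illustrative assumptions, not the paper's method:

```python
import numpy as np

# Hypothetical sketch: an external parameter table that a layer looks up
# sparsely. Only K rows are touched per input, so the table can grow large
# (increasing capacity) without increasing per-example compute.
rng = np.random.default_rng(0)

TABLE_ROWS, DIM, K = 1024, 16, 4
table = rng.normal(size=(TABLE_ROWS, DIM))  # bulk of the parameters lives here

def sparse_lookup(token_id: int) -> np.ndarray:
    # Hash the input to K table rows (toy hash) and average them.
    rows = [(token_id * 31 + j * 7919) % TABLE_ROWS for j in range(K)]
    return table[rows].mean(axis=0)

vec = sparse_lookup(42)  # a DIM-dimensional vector read from the table
```

The lookup is deterministic for a given input, so the same K rows receive gradients for that input during training while the rest of the table is untouched.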
Distillation with unlabeled examples is a popular and powerful method for training deep neural networks in settings where the amount of labeled data is limited: A large "teacher" neural network is trained on the labeled data available, and then it …
External link:
http://arxiv.org/abs/2210.06711
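The distillation recipe sketched in this entry (a trained teacher pseudo-labels an unlabeled pool, and a student is fit to the soft labels) can be illustrated with a toy NumPy example; the linear "models" and the loss below are stand-ins, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Stand-ins for a trained teacher and a small student (hypothetical linear models).
X_unlabeled = rng.normal(size=(100, 8))
W_teacher = rng.normal(size=(8, 3))

# Step 1: the teacher pseudo-labels the unlabeled pool with soft probabilities.
soft_labels = softmax(X_unlabeled @ W_teacher)

# Step 2: the student is trained to match them, e.g. by minimizing the
# cross-entropy between its predictions and the teacher's soft labels.
def distill_loss(student_logits, teacher_probs):
    z = student_logits - student_logits.max(axis=-1, keepdims=True)
    log_p = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    return -(teacher_probs * log_p).sum(axis=-1).mean()

W_student = np.zeros((8, 3))  # an untrained student predicts uniformly
loss = distill_loss(X_unlabeled @ W_student, soft_labels)
```

With all-zero student weights the predictions are uniform over 3 classes, so the loss equals log 3; training would drive it toward the entropy of the teacher's labels.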
Distilling knowledge from a large teacher model to a lightweight one is a widely successful approach for generating compact, powerful models in the semi-supervised learning setting where a limited amount of labeled data is available. In large-scale applications…
External link:
http://arxiv.org/abs/2210.01213
Author:
Baykal, Cenk
We present sampling-based algorithms with provable guarantees to alleviate the increasingly prohibitive costs of training and deploying modern AI systems. At the core of this thesis lies importance sampling, which we use to construct representative s…
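The importance-sampling idea at the core of the thesis entry above can be illustrated with a toy estimator. The sampling probabilities, subset size, and target quantity below are illustrative assumptions; the point is that sampling proportionally to each point's contribution and reweighting by the inverse probability keeps the estimate unbiased:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting: approximate the total loss over a large dataset
# from a small weighted subset.
losses = rng.random(10_000)

# Importance sampling: draw points with probability proportional to their
# contribution, and reweight by 1 / (m * p_i) so the estimate is unbiased.
p = losses / losses.sum()
m = 500
idx = rng.choice(len(losses), size=m, p=p)
weights = 1.0 / (m * p[idx])

estimate = (weights * losses[idx]).sum()
exact = losses.sum()
```

Because each sampled term `weights[i] * losses[idx[i]]` equals `exact / m` when sampling exactly proportionally to the summand, this toy estimator has zero variance; in practice the probabilities only approximate the true contributions, and the guarantees bound the resulting error.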
Deep and wide neural networks successfully fit very complex functions today, but dense models are starting to be prohibitively expensive for inference. To mitigate this, one promising direction is networks that activate a sparse subgraph of the network…
External link:
http://arxiv.org/abs/2208.04461
Graph neural networks have gained prominence due to their excellent performance in many classification and prediction tasks. In particular, they are used for node classification and link prediction which have a wide range of applications in social networks…
External link:
http://arxiv.org/abs/2202.03621
With the widespread availability of complex relational data, semi-supervised node classification in graphs has become a central machine learning problem. Graph neural networks are a recent class of easy-to-train and accurate methods for this problem…
External link:
http://arxiv.org/abs/2106.03033
We develop an online learning algorithm for identifying unlabeled data points that are most informative for training (i.e., active learning). By formulating the active learning problem as the prediction with sleeping experts problem, we provide a regret…
External link:
http://arxiv.org/abs/2104.02822
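The sleeping-experts setting mentioned in this entry can be sketched with a multiplicative-weights loop in which only "awake" experts predict and are updated each round; the losses, awake probabilities, and learning rate below are arbitrary stand-ins, not the paper's algorithm:

```python
import numpy as np

# Hypothetical sketch of prediction with sleeping experts: at each round
# only a subset of experts is awake; the learner mixes awake experts by
# weight, and multiplicative updates penalize awake experts for their loss.
rng = np.random.default_rng(0)

N, T, eta = 5, 200, 0.1  # experts, rounds, learning rate (all arbitrary)
w = np.ones(N)

total_loss = 0.0
for t in range(T):
    awake = rng.random(N) < 0.7          # which experts participate this round
    if not awake.any():
        continue
    p = w * awake
    p = p / p.sum()                      # distribution over awake experts only
    losses = rng.random(N)               # per-expert loss in [0, 1)
    total_loss += p @ losses             # learner suffers the mixture loss
    w[awake] *= np.exp(-eta * losses[awake])  # sleeping experts keep their weight
```

The regret guarantees in this line of work compare `total_loss` against the best expert measured only on the rounds it was awake, which is what makes the reduction from active learning natural: an "expert" need not commit to a prediction on every point.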