Showing 1 - 10 of 140 for search: '"Panigrahy, Rina"'
Deep networks typically learn concepts via classifiers, which involves setting up a model and training it via gradient descent to fit the concept-labeled data. We will argue instead that learning a concept could be done by looking at its moment statistics… (see the sketch after this entry)
External link:
http://arxiv.org/abs/2310.12143
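The entry above suggests describing a concept by the moment statistics of its examples rather than by a trained classifier. A minimal sketch of that general idea in Python, assuming numpy; the embeddings, the Mahalanobis-style score, and all names here are illustrative assumptions, not the paper's actual estimator:

```python
# Hypothetical sketch: represent a concept by the empirical first and second
# moments of its example embeddings, instead of training a classifier on them.
import numpy as np

def concept_moments(embeddings):
    mean = embeddings.mean(axis=0)          # first moment
    cov = np.cov(embeddings, rowvar=False)  # second (centered) moment
    return mean, cov

def concept_score(x, mean, cov):
    """Smaller = more consistent with the concept's moments (Mahalanobis)."""
    diff = x - mean
    prec = np.linalg.inv(cov + 1e-3 * np.eye(len(cov)))  # regularized inverse
    return float(diff @ prec @ diff)

rng = np.random.default_rng(0)
cats = rng.normal(loc=1.0, size=(200, 8))   # stand-in "cat" embeddings
dogs = rng.normal(loc=-1.0, size=(200, 8))  # stand-in "dog" embeddings
mu, sigma = concept_moments(cats)
print(concept_score(cats[0], mu, sigma) < concept_score(dogs[0], mu, sigma))  # True (with this seed)
```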
It has been well established that increasing scale in deep transformer networks leads to improved quality and performance. However, this increase in scale often comes with prohibitive increases in compute cost and inference latency. We introduce Alternating Updates…
External link:
http://arxiv.org/abs/2301.13310
One way of introducing sparsity into deep networks is by attaching an external table of parameters that is sparsely looked up at different layers of the network. By storing the bulk of the parameters in the external table, one can increase the capacity… (see the sketch after this entry)
External link:
http://arxiv.org/abs/2302.00003
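The mechanism sketched in that abstract, an external parameter table consulted sparsely per input, can be illustrated roughly as follows. This is a hypothetical toy: the hashing scheme, the sizes, and the residual-style update are my assumptions, not the paper's design:

```python
# Hypothetical sketch: a layer that keeps most of its parameters in an
# external table and looks up only a few rows per input.
import numpy as np

class SparseTableLayer:
    def __init__(self, table_size, dim, k=4, seed=0):
        rng = np.random.default_rng(seed)
        self.table = rng.normal(scale=0.02, size=(table_size, dim))  # bulk of params
        self.proj = rng.normal(size=(dim, k))  # small projection used for hashing
        self.k = k

    def lookup_indices(self, x):
        # Data-dependent hashing: sign pattern of k random projections picks rows.
        bits = (x @ self.proj > 0).astype(int)
        idx = bits @ (2 ** np.arange(self.k))        # integer in [0, 2^k)
        base = (idx * 2654435761) % len(self.table)  # spread over the table
        return (base + np.arange(self.k)) % len(self.table)

    def forward(self, x):
        rows = self.table[self.lookup_indices(x)]    # only k rows are touched
        return x + rows.sum(axis=0)                  # residual-style update

layer = SparseTableLayer(table_size=1 << 16, dim=32)
y = layer.forward(np.random.default_rng(1).normal(size=32))
print(y.shape)  # (32,)
```

The point of the design is that the table can grow (adding capacity) without growing the per-example compute, which stays proportional to k lookups.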
Deep and wide neural networks successfully fit very complex functions today, but dense models are starting to be prohibitively expensive for inference. To mitigate this, one promising direction is networks that activate a sparse subgraph of the network… (see the sketch after this entry)
External link:
http://arxiv.org/abs/2208.04461
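To make the "sparse subgraph" idea concrete, here is a rough top-k routed-experts toy in the same spirit. It is a generic sparsely activated layer, not the specific architecture of the paper; all sizes and names are made up:

```python
# Hypothetical sketch: activate only a small, input-dependent subgraph
# (top-k experts) instead of the whole dense network.
import numpy as np

rng = np.random.default_rng(0)
DIM, EXPERTS = 16, 8
router = rng.normal(size=(DIM, EXPERTS))                 # routing weights
experts = [rng.normal(scale=0.1, size=(DIM, DIM)) for _ in range(EXPERTS)]

def forward(x, top_k=2):
    scores = x @ router
    chosen = np.argsort(scores)[-top_k:]                 # indices of top-k experts
    weights = np.exp(scores[chosen]) / np.exp(scores[chosen]).sum()
    # Only the chosen experts' weight matrices are ever multiplied:
    return sum(w * np.tanh(x @ experts[i]) for i, w in zip(chosen, weights))

x = rng.normal(size=DIM)
print(forward(x).shape)  # (16,) -- per-input compute scales with top_k, not EXPERTS
```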
Author:
Ding, Shaojin, Wang, Weiran, Zhao, Ding, Sainath, Tara N., He, Yanzhang, David, Robert, Botros, Rami, Wang, Xin, Panigrahy, Rina, Liang, Qiao, Hwang, Dongseong, McGraw, Ian, Prabhavalkar, Rohit, Strohman, Trevor
In this paper, we propose a dynamic cascaded encoder Automatic Speech Recognition (ASR) model, which unifies models for different deployment scenarios. Moreover, the model can significantly reduce model size and power consumption without loss of quality…
External link:
http://arxiv.org/abs/2204.06164
We propose a modular architecture for the lifelong learning of hierarchically structured tasks. Specifically, we prove that our architecture is theoretically able to learn tasks that can be solved by functions that are learnable given access to functions… (see the sketch after this entry)
External link:
http://arxiv.org/abs/2112.10919
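A crude illustration of the compositional claim, that new tasks are learned as functions with access to previously learned ones, might look like this. The least-squares heads and the library interface are stand-ins I am assuming, not the paper's construction:

```python
# Hypothetical sketch: each new task is fit on top of frozen modules learned
# for earlier tasks, so later functions compose earlier ones.
import numpy as np

class ModuleLibrary:
    def __init__(self):
        self.modules = []  # frozen functions from earlier tasks

    def learn_task(self, X, y):
        frozen = list(self.modules)  # snapshot: earlier modules are reused as-is
        def feats(Z):
            outs = [m(Z) for m in frozen]
            return np.hstack([Z] + outs) if outs else Z
        # Only a small new head is trained (least squares as a stand-in learner).
        w, *_ = np.linalg.lstsq(feats(X), y, rcond=None)
        module = lambda Z: feats(Z) @ w
        self.modules.append(module)
        return module

rng = np.random.default_rng(0)
lib = ModuleLibrary()
X = rng.normal(size=(100, 4))
f1 = lib.learn_task(X, X[:, :1])           # task 1: a simple target
f2 = lib.learn_task(X, f1(X) + X[:, 1:2])  # task 2 builds on task 1's output
print(f2(X).shape)  # (100, 1)
```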
Author:
Agarwala, Atish, Das, Abhimanyu, Juba, Brendan, Panigrahy, Rina, Sharan, Vatsal, Wang, Xin, Zhang, Qiuyi
Can deep learning solve multiple tasks simultaneously, even when they are unrelated and very different? We investigate how the representations of the underlying tasks affect the ability of a single neural network to learn them jointly. We present the…
External link:
http://arxiv.org/abs/2103.15261
It is well established that training deep neural networks gives useful representations that capture essential features of the inputs. However, these representations are poorly understood in theory and practice. In the context of supervised learning…
External link:
http://arxiv.org/abs/2103.06875
Large neural network models have been successful in learning functions of importance in many branches of science, including physics, chemistry and biology. Recent theoretical work has shown explicit learning bounds for wide networks and kernel methods…
External link:
http://arxiv.org/abs/2005.07724
Author:
Panigrahy, Rina
How we store information in our mind has long been an intriguing open question. We approach this question not from a physiological standpoint as to how information is physically stored in the brain, but from a conceptual and algorithmic standpoint as to…
External link:
http://arxiv.org/abs/1910.06718