Showing 1 - 10 of 19
for search: '"Yen, Ian E. H."'
Author:
Huang, Shaoyi, Xu, Dongkuan, Yen, Ian E. H., Wang, Yijue, Chang, Sung-en, Li, Bingbing, Chen, Shiyang, Xie, Mimi, Rajasekaran, Sanguthevar, Liu, Hang, Ding, Caiwen
Conventional wisdom in pruning Transformer-based language models is that pruning reduces the model expressiveness and thus is more likely to underfit rather than overfit. However, under the trending pretrain-and-finetune paradigm, we postulate a coun…
External link:
http://arxiv.org/abs/2110.08190
Transformer-based pre-trained language models have significantly improved the performance of various natural language processing (NLP) tasks in the recent years. While effective and prevalent, these models are usually prohibitively large for resource…
External link:
http://arxiv.org/abs/2104.08682
Author:
Paria, Biswajit, Yeh, Chih-Kuan, Yen, Ian E. H., Xu, Ning, Ravikumar, Pradeep, Póczos, Barnabás
Deep representation learning has become one of the most widely adopted approaches for visual search, recommendation, and identification. Retrieval of such representations from a large database is however computationally challenging. Approximate metho…
External link:
http://arxiv.org/abs/2004.05665
We propose to explain the predictions of a deep neural network, by pointing to the set of what we call representer points in the training set, for a given test point prediction. Specifically, we show that we can decompose the pre-activation predictio…
External link:
http://arxiv.org/abs/1811.09720
Author:
Wu, Lingfei, Yen, Ian E. H., Xu, Kun, Xu, Fangli, Balakrishnan, Avinash, Chen, Pin-Yu, Ravikumar, Pradeep, Witbrock, Michael J.
While the celebrated Word2Vec technique yields semantically rich representations for individual words, there has been relatively less success in extending to generate unsupervised sentences or documents embeddings. Recent work has demonstrated that a…
External link:
http://arxiv.org/abs/1811.01713
Tensor decomposition has been extensively used as a tool for exploratory analysis. Motivated by neuroscience applications, we study tensor decomposition with Boolean factors. The resulting optimization problem is challenging due to the non-convex obj…
External link:
http://arxiv.org/abs/1810.04754
The kernel method has been developed as one of the standard approaches for nonlinear learning, which, however, does not scale to large data sets due to its quadratic complexity in the number of samples. A number of kernel approximation methods have thus be…
External link:
http://arxiv.org/abs/1809.05247
We consider the class of optimization problems arising from computationally intensive L1-regularized M-estimators, where the function or gradient values are very expensive to compute. A particular instance of interest is the L1-regularized MLE for le…
External link:
http://arxiv.org/abs/1406.7321
Author:
Yen IEH; Carnegie Mellon University, U.S.A., Lee WC; National Taiwan University, Taiwan., Chang SE; National Taiwan University, Taiwan., Suggala AS; Carnegie Mellon University, U.S.A., Lin SD; National Taiwan University, Taiwan., Ravikumar P; Carnegie Mellon University, U.S.A.
Published in:
Proceedings of machine learning research [Proc Mach Learn Res] 2017 Aug; Vol. 70, pp. 3949-3957.
Author:
Zhang J; University of Texas at Austin., Yen IEH; Carnegie Mellon University., Ravikumar P; Carnegie Mellon University., Dhillon IS; University of Texas at Austin.
Published in:
JMLR workshop and conference proceedings [JMLR Workshop Conf Proc] 2017 Apr; Vol. 54, pp. 1514-1522.