Showing 1 - 10 of 59 for search: '"Yao Yiwu"'
The memory and computational demands of Key-Value (KV) cache present significant challenges for deploying long-context language models. Previous approaches attempt to mitigate this issue by selectively dropping tokens, which irreversibly erases criti
External link:
http://arxiv.org/abs/2407.15891
Author:
Zhang, Yuxin, Zhao, Lirui, Lin, Mingbao, Sun, Yunyun, Yao, Yiwu, Han, Xingjia, Tanner, Jared, Liu, Shiwei, Ji, Rongrong
The ever-increasing large language models (LLMs), though opening a potential path for the upcoming artificial general intelligence, sadly drops a daunting obstacle on the way towards their on-device deployment. As one of the most well-established pre
External link:
http://arxiv.org/abs/2310.08915
Published in:
Jisuanji kexue yu tansuo, Vol 17, Iss 11, Pp 2721-2733 (2023)
Knowledge distillation is an effective method for model compression with access to training data. However, due to privacy, confidentiality, or transmission limitations, people cannot get the support of data. Existing data-free knowledge distillation
External link:
https://doaj.org/article/50eee61b84a6498f83db78e5db4c0885
Recently, end-to-end (E2E) speech recognition has become popular, since it can integrate the acoustic, pronunciation and language models into a single neural network, which outperforms conventional models. Among E2E approaches, attention-based models
External link:
http://arxiv.org/abs/2104.05784
Author:
Yao, Yiwu, Li, Yuchao, Wang, Chengyu, Yu, Tianhang, Chen, Houjiang, Jiang, Xiaotang, Yang, Jun, Huang, Jun, Lin, Wei, Shu, Hui, Lv, Chengfei
The intensive computation of Automatic Speech Recognition (ASR) models obstructs them from being deployed on mobile devices. In this paper, we present a novel quantized Winograd optimization pipeline, which combines the quantization and fast convolut
External link:
http://arxiv.org/abs/2010.14841
Author:
Yao, Yiwu, Cheng, Yuhua
Fully parallel architecture at disparity-level for efficient semi-global matching (SGM) with refined rank method is presented. The improved SGM algorithm is implemented with the non-parametric unified rank model which is the combination of Rank filte
External link:
http://arxiv.org/abs/1905.03716
To achieve lightweight object detectors for deployment on the edge devices, an effective model compression pipeline is proposed in this paper. The compression pipeline consists of automatic channel pruning for the backbone, fixed channel deletion for
External link:
http://arxiv.org/abs/1905.01787
Author:
Zhang, Wanheng, Zhang, Kuojun, Yao, Yiwu, Liu, Yunyao, Ni, Yong, Liao, Chenzhong, Tu, Zhengchao, Qiu, Yatao, Wang, Dexiang, Chen, Dong, Qiang, Lei, Li, Zheng, Jiang, Sheng
Published in:
In European Journal of Medicinal Chemistry 5 February 2021 211
Author:
Yao, Yiwu, Liao, Chenzhong, Li, Zheng, Wang, Zhen, Sun, Qiao, Liu, Chunping, Yang, Yang, Tu, Zhengchao, Jiang, Sheng
Published in:
In European Journal of Medicinal Chemistry 30 October 2014 86:639-652
Author:
Su, Jinyue, Qiu, Yatao, Ma, Kun, Yao, Yiwu, Wang, Zhen, Li, Xianling, Zhang, Dayong, Tu, Zhengchao, Jiang, Sheng
Published in:
In Tetrahedron 21 October 2014 70(42):7763-7769