Showing 1 - 9 of 9 for search: '"Luo, Mingshuang"'
Author:
Luo, Mingshuang, Hou, Ruibing, Li, Zhuo, Chang, Hong, Liu, Zimo, Wang, Yaowei, Shan, Shiguang
This paper presents M³GPT, an advanced Multimodal, Multitask framework for Motion comprehension and generation. M³GPT operates on three fundamental principles. The first focuses on creating a unified representation…
External link:
http://arxiv.org/abs/2405.16273
Author:
Guo, Liyong, Yang, Xiaoyu, Wang, Quandong, Kong, Yuxiang, Yao, Zengwei, Cui, Fan, Kuang, Fangjun, Kang, Wei, Lin, Long, Luo, Mingshuang, Zelasko, Piotr, Povey, Daniel
Knowledge distillation (KD) is a common approach to improving model performance in automatic speech recognition (ASR), where a student model is trained to imitate the output behaviour of a teacher model. However, traditional KD methods suffer from teacher…
External link:
http://arxiv.org/abs/2211.00508
Author:
Kang, Wei, Guo, Liyong, Kuang, Fangjun, Lin, Long, Luo, Mingshuang, Yao, Zengwei, Yang, Xiaoyu, Żelasko, Piotr, Povey, Daniel
The transducer architecture is becoming increasingly popular in the field of speech recognition, because it is naturally streaming as well as high in accuracy. One of the drawbacks of the transducer is that it is difficult to decode in a fast and parallel…
External link:
http://arxiv.org/abs/2211.00484
Author:
Kuang, Fangjun, Guo, Liyong, Kang, Wei, Lin, Long, Luo, Mingshuang, Yao, Zengwei, Povey, Daniel
The RNN-Transducer (RNN-T) framework for speech recognition has been growing in popularity, particularly for deployed real-time ASR systems, because it combines high accuracy with naturally streaming recognition. One of the drawbacks of RNN-T is that…
External link:
http://arxiv.org/abs/2206.13236
Author:
Fang, Yitian, Luo, Mingshuang, Ren, Zhixiang, Wei, Leyi, Wei, Dong-Qing
Published in:
Briefings in Bioinformatics, Jul 2024, Vol. 25, Issue 4, pp. 1-12.
Lip reading has received increasing attention in recent years. This paper focuses on the synergy of multilingual lip reading. There are as many as 7,000 languages in the world, which implies that it is impractical to train separate lip reading models…
External link:
http://arxiv.org/abs/2005.03846
Lip-reading aims to infer the speech content from the lip movement sequence and can be seen as a typical sequence-to-sequence (seq2seq) problem which translates the input image sequence of lip movements to the text sequence of the speech content. However…
External link:
http://arxiv.org/abs/2003.03983
Academic article
Author:
Guo, Liyong, Yang, Xiaoyu, Wang, Quandong, Kong, Yuxiang, Yao, Zengwei, Cui, Fan, Kuang, Fangjun, Kang, Wei, Lin, Long, Luo, Mingshuang, Zelasko, Piotr, Povey, Daniel
Published in:
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Knowledge distillation (KD) is a common approach to improving model performance in automatic speech recognition (ASR), where a student model is trained to imitate the output behaviour of a teacher model. However, traditional KD methods suffer from teacher…