Showing 1 - 10 of 38 for search: '"XIULIAN PENG"'
Published in:
ACM Transactions on Multimedia Computing, Communications, and Applications.
We aim to super-resolve text images from unrecognizable low-resolution inputs. Existing super-resolution methods mainly learn a direct mapping from low-resolution to high-resolution images by exploring low-level features, which usually generate blurr
Published in:
Proceedings of the AAAI Conference on Artificial Intelligence. 36:11648-11656
Speech enhancement aims at recovering a clean speech from a noisy input, which can be classified into single speech enhancement and personalized speech enhancement. Personalized speech enhancement usually utilizes the speaker identity extracted from
Published in:
IEEE Signal Processing Letters. 29:967-971
Existing deep learning based speech enhancement (SE) methods either use blind end-to-end training or explicitly incorporate speaker embedding or phonetic information into the SE network to enhance speech quality. In this paper, we perceive speech and
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::3450ea650fe274ed5b63151fb61a6c1e
http://arxiv.org/abs/2302.11558
For real-time speech enhancement (SE) including noise suppression, dereverberation and acoustic echo cancellation, the time-variance of the audio signals becomes a severe challenge. The causality and memory usage limit that only the historical inform
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::71f1a81d15ee632cf50c0fc42700fcde
In this paper we propose a multi-modal multi-correlation learning framework targeting at the task of audio-visual speech separation. Although previous efforts have been extensively put on combining audio and visual modalities, most of them solely ado
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::7570676379c0fcb223e9533fa6c98b47
http://arxiv.org/abs/2207.01197
Neural audio/speech coding has recently demonstrated its capability to deliver high quality at much lower bitrates than traditional methods. However, existing neural audio/speech codecs employ either acoustic features or learned blind features with a
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::8d685579b284afd959c4dd02fba9e241
Published in:
Journal of Visual Communication and Image Representation. 60:426-440
This paper presents a video coding scheme tailored for traffic surveillance videos, which features a pre-built library that is utilized in both encoder and decoder to pursue higher compression efficiency. We are motivated by the observation that, in
Published in:
ICASSP
Existing speech enhancement methods mainly separate speech from noises at the signal level or in the time-frequency domain. They seldom pay attention to the semantic information of a corrupted signal. In this paper, we aim to bridge this gap by extra
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::874eff1db3090f2e3eb3569d413e84cd
Published in:
ICME
Redundancy is necessary for a storage system to recover from errors. The frequent errors in large-scale systems, e.g. cloud, make it desired to reduce the recovery cost. Among all kinds of data stored in the cloud, video takes a large portion due to