Showing 1 - 10 of 5,508 for search: '"word spotting"'
Speech has emerged as a widely embraced user interface across diverse applications. However, for individuals with dysarthria, the inherent variability in their speech poses significant challenges. This paper presents an end-to-end Pretrain-based Dual
External link:
http://arxiv.org/abs/2409.10076
For the SLT 2024 Low-Resource Dysarthria Wake-Up Word Spotting (LRDWWS) Challenge, we introduce the PB-LRDWWS system. This system combines a dysarthric speech content feature extractor for prototype construction with a prototype-based classification
External link:
http://arxiv.org/abs/2409.04799
In recent years, neural network-based Wake Word Spotting has achieved good performance on clean audio samples but struggles in noisy environments. Audio-Visual Wake Word Spotting (AVWWS) has received a lot of attention because visual lip movement information
External link:
http://arxiv.org/abs/2403.01700
Author:
Sahai, Saumya Y., Liu, Jing, Muniyappa, Thejaswi, Sathyendra, Kanthashree M., Alexandridis, Anastasios, Strimel, Grant P., McGowan, Ross, Rastrow, Ariya, Chang, Feng-Ju, Mouchtaris, Athanasios, Kunzmann, Siegfried
We present dual-attention neural biasing, an architecture designed to boost Wake Words (WW) recognition and improve inference time latency on speech recognition tasks. This architecture enables a dynamic switch for its runtime compute paths by exploi
External link:
http://arxiv.org/abs/2304.01905
This paper further explores our previous wake word spotting system, which ranked 2nd in Track 1 of the MISP Challenge 2021. First, we investigate a robust unimodal approach based on 3D and 2D convolution and adopt the simple attention module (SimAM) for ou
External link:
http://arxiv.org/abs/2303.02348
Author:
Xu, Yanguang, Sun, Jianwei, Han, Yang, Zhao, Shuaijiang, Mei, Chaoyang, Guo, Tingwei, Zhou, Shuran, Xie, Chuandong, Zou, Wei, Li, Xiangang
This paper presents the details of our system designed for Task 1 of the Multimodal Information Based Speech Processing (MISP) Challenge 2021. The purpose of Task 1 is to leverage both audio and video information to improve the environmental robustne
External link:
http://arxiv.org/abs/2204.08686
Audio-only-based wake word spotting (WWS) is challenging under noisy conditions due to environmental interference in signal transmission. In this paper, we investigate the design of a compact audio-visual WWS system by utilizing visual information to
External link:
http://arxiv.org/abs/2202.08509
Published in:
In Pattern Recognition Letters October 2023 174:39-45
Published in:
CAAI Transactions on Intelligence Technology, Vol 8, Iss 4, Pp 1578-1589 (2023)
Abstract Audio‐visual wake word spotting is a challenging multi‐modal task that exploits visual information of lip motion patterns to supplement acoustic speech to improve overall detection performance. However, most audio‐visual wake word spot
External link:
https://doaj.org/article/10ccc642cfc3470b83b7d792a82f0570